field theory textbooks, but often they tend to peter out just as the fun gets going. Here
• John Preskill, Lectures on Quantum Field Theory


Preskill’s beautiful and comprehensive lectures on quantum field theory are the closest 
lines in the ’t Hooft limit?) This book is a collection of preprints, prefaced by some
brief remarks. Still, the originals are well worth the read. 


• Yitzhak Frishman and Cobi Sonnenschein, Non-Perturbative Field Theory: From
Two Dimensional Conformal Field Theory to QCD in Four Dimensions 


The goal of this book is similar to these lectures but the itinerary is run in reverse, 
starting in two dimensions and building up to four. 


• Eduardo Fradkin, Field Theories in Condensed Matter Physics
• Shankar, Quantum Field Theory and Condensed Matter


Both of these books discuss quantum field theory in condensed matter physics. Much 
of the material is restricted to field theories in d = 1+1 and d = 2+1 dimensions, and 
so useful for Sections 7 and 8. But the general approach to understanding the phase 
structure and behaviour of field theories should resonate. 


Lecture notes on various topics discussed in these lectures can be downloaded from 
the course webpage. 
0. Introduction


Towards the end of the day, as feathers droop and hearts flutter from too much flapping, 
it is not unusual to find flocks of birds resting on high voltage wires. For someone 
unacquainted with the gauge principle, this may seem like a dangerous act. But birds 
know better. There is no absolute sense in which the voltage of the wire is high. It is 
only high in comparison to the Earth. 


Of the many fillets and random facts that we are fed in high school science classes, 
the story of the birds is perhaps the deepest. Most other ideas from our early physics 
lessons look increasingly antiquated as we gain a deeper understanding of the Universe. 
The concept of “force”, for example, is very 17th century. Yet the curious fact that the 
electrostatic potential does not matter, only the potential difference, blossoms into the 
gauge symmetry which underlies the Maxwell equations, the Standard Model and, in 
the guise of diffeomorphism invariance, general relativity. 


Gauge symmetry is, in many ways, an odd foundation on which to build our best 
theories of physics. It is not a property of Nature, but rather a property of how 
we choose to describe Nature. Gauge symmetry is, at heart, a redundancy in our 
description of the world. Yet it is a redundancy that has enormous utility, and brings 
a subtlety and richness to those theories that enjoy it. 


This course is about the quantum dynamics of gauge theories. It is here that the 
utility of gauge invariance is clearest. At the perturbative level, the redundancy allows 
us to make manifest the properties of quantum field theories, such as unitarity, locality, 
and Lorentz invariance, that we feel are vital for any fundamental theory of physics 
but which teeter on the verge of incompatibility. If we try to remove the redundancy 
by fixing some specific gauge, some of these properties will be brought into focus, 
while others will retreat into murk. By retaining the redundancy, we can flit between 
descriptions as is our wont, keeping whichever property we most cherish in clear sight.


The purpose of this course is not so much to convince you that gauge theories are 
useful, but rather to explore their riches. Even at the classical level they have much 
to offer. Gauge theories are, like general relativity, founded in geometry. They are not 
associated only to the geometry of spacetime, but to a less intuitive and more general 
mathematical construct known as a fibre bundle. This brings something new to the 
table. While most interesting applications of general relativity are restricted to ripples 
of the curved, but topologically flat, spacetime in which we live, gauge fields are more 
supple: they can twist and wind in novel ways, bringing the subject of topology firmly 
into the realm of physics. This will be a dominant theme throughout these lectures. It
is a theme that becomes particularly subtle when we include fermions in the mix, and
see how they intertwine with the gauge fields. 


However, the gauge theoretic fun really starts when we fully immerse ourselves in 
the quantum world. The vast majority of gauge theories are strongly coupled quantum 
field theories, where the usual perturbative techniques are insufficient to answer many 
questions of interest. Despite many decades of work, our understanding of this area 
remains rather primitive. Yet this is where the most interesting phenomena occur. In 
particle physics, the strong coupling dynamics of quantum field theory causes quarks 
and gluons to bind into protons, neutrons and other particles. In condensed matter 
physics, it causes electrons, which are indivisible particles, to fractionalise in high 
magnetic fields. There are even tantalising hints that such dynamics may be responsible 
for the emergence of space and time itself from more fundamental underlying degrees 
of freedom. The focus of these lectures is not on any particular phenomenon (although 
confinement in QCD will be something of a pre-occupation). Rather we will try to 
explain some of the ways in which we can make progress, primitive as it may be, in 
understanding gauge fields when interactions become strong, and quantum fluctuations 
wild. 


1. Topics in Electromagnetism 


We start these lectures by reviewing some topics in Maxwell theory. As we will see, 
there are some beautiful topological surprises hiding in electromagnetism that are not 
usually covered in our first undergraduate lectures. These topics will then follow us 
through these lectures as we explore other examples of gauge theories. 


1.1 Magnetic Monopoles 


A magnetic monopole is an object which emits a radial magnetic field of the form 


    \mathbf{B} = \frac{g \hat{\mathbf{r}}}{4\pi r^2}    (1.1)

Here g is called the magnetic charge.


We learn as undergraduates that magnetic monopoles don’t exist. First, and most 
importantly, they have never been observed. Second, there’s a law of physics which
insists that they can’t exist. This is the Maxwell equation

    \nabla \cdot \mathbf{B} = 0


Third, this particular Maxwell equation would appear to be non-negotiable. This 
is because it follows from the definition of the magnetic field in terms of the gauge 
potential 


    \mathbf{B} = \nabla \times \mathbf{A}    \Rightarrow    \nabla \cdot \mathbf{B} = 0


Yet the gauge potential A is indispensable in theoretical physics. It is needed whenever 
we describe the quantum physics of particles moving in magnetic fields. Underlying 
this statement is the fact that the gauge potential is needed in the classical Hamiltonian 
treatment. Moreover, there are more subtle phenomena such as the Aharonov-Bohm 
effect which tell us that there is further, non-local information stored in the gauge 
potentials. (The Aharonov-Bohm effect was covered in the lectures on Applications of 
Quantum Mechanics.) All of this points to the fact that we would be wasting our time 
discussing magnetic monopoles. 


Happily, there is a glorious loophole in all of these arguments, first discovered by 
Dirac, and magnetic monopoles play a crucial role in our understanding of the more 
subtle effects in gauge theories. The essence of this loophole is that there is an ambiguity 
in how we define the gauge potentials. In this section, we will see how we can exploit 
this. 


1.1.1 Dirac Quantisation 


It turns out that not any magnetic charge g is compatible with quantum mechanics. 
Since this will be important, we will present several different arguments for the allowed 
values of g. 


We start with the simplest, and most physical, of these arguments. For this we need to
know a fact from quantum mechanics. Suppose that we take a particle which carries 
electric charge e. We adiabatically transport it along some closed path C in the back- 
ground of some gauge potential A(x,t). Then, upon returning to its initial starting 
position, the wavefunction of the particle picks up a phase 


    \psi \to e^{ie\alpha/\hbar} \psi    with    \alpha = \oint_C \mathbf{A} \cdot d\mathbf{x}    (1.2)

There are different ways to see this, but the simplest is from the path integral approach
to quantum mechanics, where the action for a point particle includes the term ∫ dt e ẋ · A;
this directly gives the phase above.


The phase of the wavefunction is not an observable quantity in quantum mechanics. 
However, the phase in (1.2) is really a phase difference. We could, for example, place a 
particle in a superposition of two states, one of which stays still while the other travels 
around the loop C. The subsequent interference will depend on the phase e^{ieα/ħ}. Indeed,
this is the essence of the Aharonov-Bohm effect. 


Let’s now see what this has to do with magnetic monopoles. We place our electric 
particle, with charge e, in the background of a magnetic monopole with magnetic charge 
g. We keep the magnetic monopole fixed, and let the electric particle undergo some 
journey along a path C. We will ask only that the path C avoid the origin where the 
magnetic monopole is sitting. This is shown in the left-hand panel of the figure. Upon 
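returning, the particle picks up a phase e^{ieα/ħ} with

    \alpha = \oint_C \mathbf{A} \cdot d\mathbf{x} = \int_S d\mathbf{S} \cdot \mathbf{B}

where, as shown in the figure, S is the area enclosed by C. Using the fact that the
total flux through a sphere surrounding the monopole is ∫ dS · B = g, if the surface S
makes a solid angle Ω, this phase can be written as

    \alpha = \frac{\Omega g}{4\pi}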


Figure 1: Integrating over S...    Figure 2: ...or over S'.


However, there’s an ambiguity in this computation. Instead of integrating over S, it 
is equally valid to calculate the phase by integrating over S’, shown in the right-hand 
panel of the figure. The solid angle formed by S' is Ω' = 4π − Ω. The phase is then
given by

    \alpha' = -\frac{(4\pi - \Omega) g}{4\pi}

where the overall minus sign comes because the surface S' has the opposite orientation
to S. As we mentioned above, the phase shift that we get in these calculations is
observable: we can’t tolerate different answers from different calculations. This means
that we must have e^{ieα/ħ} = e^{ieα'/ħ}. This gives the condition

    eg = 2\pi\hbar n    with n \in \mathbf{Z}    (1.3)


This is the famous Dirac quantisation condition. The smallest such magnetic charge is
also referred to as the quantum of flux, Φ0 = 2πħ/e.
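
The consistency argument above is easy to test numerically. The following is a minimal sketch, assuming SI values for ħ and e; the function names are illustrative and not part of the lectures. It evaluates the two phases, computed from S and from S', and checks that they agree only when g is an integer multiple of Φ0.

```python
import cmath
import math

HBAR = 1.054571817e-34       # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19   # elementary charge, C


def flux_quantum(e=E_CHARGE, hbar=HBAR):
    """Smallest magnetic charge allowed by (1.3): Phi_0 = 2 pi hbar / e."""
    return 2 * math.pi * hbar / e


def phases(g, solid_angle, e=E_CHARGE, hbar=HBAR):
    """Phase factors obtained by integrating over S and over S'."""
    alpha = solid_angle * g / (4 * math.pi)
    alpha_prime = -(4 * math.pi - solid_angle) * g / (4 * math.pi)
    return (cmath.exp(1j * e * alpha / hbar),
            cmath.exp(1j * e * alpha_prime / hbar))


def consistent(g, solid_angle, tol=1e-9):
    """True if the two surfaces give the same observable phase."""
    p, p_prime = phases(g, solid_angle)
    return abs(p - p_prime) < tol


if __name__ == "__main__":
    phi0 = flux_quantum()
    print(consistent(phi0, 1.0))        # integer multiple of Phi_0
    print(consistent(0.5 * phi0, 1.0))  # half a flux quantum
```

The two phases differ by the factor e^{ieg/ħ}, independent of the solid angle Ω, which is why the check does not care which loop C we pick.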


Above we worked with a single particle of charge e. Obviously, the same argument 
holds for any other particle of charge e’. There are two possibilities. The first is that 
all particles carry charge that is an integer multiple of some smallest unit. In this case, 
it’s sufficient to impose the Dirac quantisation condition (1.3) where e is the smallest 
unit of charge. For example, in our world we should take e to be the electron charge. 
(You might want to insist that monopoles carry a larger magnetic charge so that they 
are consistent with quarks which have one third the electron charge. However, it turns 
out this isn’t necessary if the monopoles also carry colour magnetic charge.) 


The second possibility is that the particles carry electric charges which are irrational 
multiples of each other. For example, there may be a particle with charge e and another 
particle with charge √2 e. In this case, no magnetic monopoles are allowed.


It’s sometimes said that the existence of a magnetic monopole would imply the 
quantisation of electric charges. This, however, has it slightly backwards. (It also 
misses the point that we have a beautiful explanation of the quantisation of charges 
from anomaly cancellation in the Standard Model; we will tell this story in Section 
3.4.4.) Instead, the key distinction is the choice of Abelian gauge group. A U(1) gauge 
group has only integer electric charges and admits magnetic monopoles. In contrast, a 
gauge group R can have any irrational charges, but the price you pay is that there are 
no longer monopoles. 


Above we looked at an electrically charged particle moving in the background of 
a magnetically charged particle. It is simple to generalise the discussion to particles 
that carry both electric and magnetic charges. These are called dyons. For two dyons, 
with charges (e1, g1) and (e2, g2), the generalisation of the Dirac quantisation condition 
requires 


    e_1 g_2 - e_2 g_1 \in 2\pi\hbar \mathbf{Z}    (1.4)

This is sometimes called the Dirac-Zwanziger condition. 


1.1.2 A Patchwork of Gauge Fields 


The discussion above shows how quantum mechanics constrains the allowed values of 
magnetic charge. It did not, however, address the main obstacle to constructing a 
magnetic monopole out of gauge fields A when the condition B = V x A would seem 
to explicitly forbid such objects. 


Let’s see how to do this. Our goal is to write down a configuration of gauge fields 
which give rise to the magnetic field (1.1) of a monopole which we will place at the 
origin. We will need to be careful about what we want such a gauge field to look like. 


The first point is that we won’t insist that the gauge field is well defined at the origin. 
After all, the gauge fields arising from an electron are not well defined at the position of 
an electron and it would be churlish to require more from a monopole. This fact gives 
us our first bit of leeway, because now we need to write down gauge fields on R³\{0},
as opposed to R³, and the space with a point cut out enjoys some non-trivial topology
that we will make use of. 


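Now consider the following gauge connection, written in spherical polar coordinates

    A^N_\phi = \frac{g}{4\pi r} \frac{1 - \cos\theta}{\sin\theta}    (1.5)

The resulting magnetic field is

    \mathbf{B} = \nabla \times \mathbf{A} = \frac{1}{r\sin\theta} \frac{\partial}{\partial\theta} \left( A^N_\phi \sin\theta \right) \hat{\mathbf{r}} - \frac{1}{r} \frac{\partial}{\partial r} \left( r A^N_\phi \right) \hat{\boldsymbol{\theta}}

Substituting in (1.5) gives

    \mathbf{B} = \frac{g \hat{\mathbf{r}}}{4\pi r^2}    (1.6)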


In other words, this gauge field results in the magnetic monopole. But how is this 
possible? Didn’t we learn as undergraduates that if we can write B = ∇ × A then
∫ dS · B = 0? How does the gauge potential (1.5) manage to avoid this conclusion?


The answer is that A^N in (1.5) is actually a singular gauge connection. It’s not just
singular at the origin, where we’ve agreed this is allowed, but it is singular along an
entire half-line that extends from the origin to infinity. This is due to the 1/sin θ term
which diverges at θ = 0 and θ = π. However, the numerator 1 − cos θ has a zero when
θ = 0 and the gauge connection is fine there. But the singularity along the half-line
θ = π remains. The upshot is that this gauge connection is not acceptable along the
line of the south pole, but is fine elsewhere. This is what the superscript N is there to
remind us: this gauge connection is fine as long as we keep north.


Now consider a different gauge connection 


    A^S_\phi = -\frac{g}{4\pi r} \frac{1 + \cos\theta}{\sin\theta}    (1.7)

This again gives rise to the magnetic field (1.6). This time it is well behaved at θ = π,
but singular at the north pole θ = 0. The superscript S is there to remind us that this
connection is fine as long as we keep south.
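
A quick numerical check of the two patches may be reassuring. The sketch below assumes the forms A^N_φ = (g/4πr)(1 − cos θ)/sin θ and A^S_φ = −(g/4πr)(1 + cos θ)/sin θ, and evaluates the radial component of the curl by a central finite difference; the helper names are ours. Since r A_φ is independent of r in both patches, the θ-component of the curl vanishes and only B_r needs checking.

```python
import math


def a_north(r, theta, g=1.0):
    """Azimuthal component of the northern patch, regular at theta = 0."""
    return g / (4 * math.pi * r) * (1 - math.cos(theta)) / math.sin(theta)


def a_south(r, theta, g=1.0):
    """Azimuthal component of the southern patch, regular at theta = pi."""
    return -g / (4 * math.pi * r) * (1 + math.cos(theta)) / math.sin(theta)


def radial_b(a_phi, r, theta, g=1.0, h=1e-5):
    """B_r = (1 / (r sin theta)) d/dtheta (A_phi sin theta), by central difference."""
    def f(t):
        return a_phi(r, t, g) * math.sin(t)
    return (f(theta + h) - f(theta - h)) / (2 * h) / (r * math.sin(theta))


def monopole_b(r, g=1.0):
    """Radial monopole field g / (4 pi r^2)."""
    return g / (4 * math.pi * r ** 2)


if __name__ == "__main__":
    r, theta = 2.0, 1.0
    print(radial_b(a_north, r, theta), radial_b(a_south, r, theta), monopole_b(r))
    # The patches differ by g / (2 pi r sin theta), a pure gradient in phi.
    print(a_north(r, theta) - a_south(r, theta), 1.0 / (2 * math.pi * r * math.sin(theta)))
```

Both patches return the same B_r away from the poles, while their difference is proportional to 1/(r sin θ), which is the φ-component of a gradient.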


At this point, we make use of the ambiguity in the gauge connection. We are going 
to take A^N in the northern hemisphere and A^S in the southern hemisphere. This is
allowed because the two gauge potentials are the same up to a gauge transformation,
A → A + ∇ω. Recalling the expression for ∇ω in spherical polars, we find that for
θ ≠ 0, π, we can indeed relate A^N_φ and A^S_φ by a gauge transformation,

    A^N_\phi = A^S_\phi + \frac{1}{r\sin\theta} \partial_\phi \omega    where    \omega = \frac{g\phi}{2\pi}    (1.8)

However, there’s still a question remaining: is this gauge transformation allowed? The
problem is that the function ω is not single valued: ω(φ = 2π) = ω(φ = 0) + g. Should
this concern us?


To answer this, we need to think more carefully about what we require from a gauge 
transformation. This is where the charged matter comes in. In quantum mechanics, 
the gauge transformation acts on the wavefunction of the particle as 


    \psi \to e^{ie\omega/\hbar} \psi

In quantum field theory, we have the same transformation but now with ψ interpreted
as the field. We will not require that the gauge transformation ω is single-valued, but
only that the wavefunction ψ is single-valued. This holds for the gauge transformation
(1.8) provided that we have

    eg = 2\pi\hbar n    with n \in \mathbf{Z}

This, of course, is the Dirac quantisation condition (1.3).


Mathematically, this is a construction of a topologically non-trivial U(1) bundle over
the S² surrounding the origin. In this context, the integer n is called the first Chern
number. 


1.1.3 Monopoles and Angular Momentum 


Here we provide yet another derivation of the Dirac quantisation condition, this time 
due to Saha. The key idea is that the quantisation of magnetic charge actually follows 
from the more familiar quantisation of angular momentum. The twist is that, in the 
presence of a magnetic monopole, angular momentum isn’t quite what you thought. 


Let’s start with some simple classical mechanics. The equation of motion for a 
particle of mass m and charge e and position r, moving in a magnetic field B, is the 
familiar Lorentz force law 


    \frac{d\mathbf{p}}{dt} = e \dot{\mathbf{r}} \times \mathbf{B}

with p = mṙ the mechanical momentum. If you remember the Hamiltonian formalism
for a particle in a magnetic field, you might recall that p is not the canonical momentum,
a fact which is hiding in the background in what follows. Now let’s consider this
equation in the background of a magnetic monopole, with the field (1.1),

    \mathbf{B} = \frac{g \hat{\mathbf{r}}}{4\pi r^2}


The monopole has rotational symmetry so we would expect that the angular momen- 
tum, r x p, is conserved. Let’s check: 


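    \frac{d(\mathbf{r} \times \mathbf{p})}{dt} = \dot{\mathbf{r}} \times \mathbf{p} + \mathbf{r} \times \dot{\mathbf{p}} = \mathbf{r} \times \dot{\mathbf{p}} = e \mathbf{r} \times (\dot{\mathbf{r}} \times \mathbf{B})
        = \frac{eg}{4\pi r^3} \mathbf{r} \times (\dot{\mathbf{r}} \times \mathbf{r}) = \frac{eg}{4\pi} \left( \frac{\dot{\mathbf{r}}}{r} - \frac{\dot{r} \mathbf{r}}{r^2} \right) = \frac{d}{dt} \left( \frac{eg}{4\pi} \hat{\mathbf{r}} \right)

We see that in the presence of a magnetic monopole, the naive angular momentum r × p
is not conserved! However, we can easily write down a modified angular momentum
that is conserved, namely¹

    \mathbf{L} = \mathbf{r} \times \mathbf{p} - \frac{eg}{4\pi} \hat{\mathbf{r}}

The extra term can be thought of as the angular momentum stored in E × B. The
surprise is that the particle has angular momentum even if it doesn’t move!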
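
The conservation law can also be seen numerically. What follows is a rough sketch rather than anything from the lectures: it integrates the Lorentz force law in the monopole field with a hand-written fourth-order Runge-Kutta step, in units where m = e = 1 and g = 4π (so that eg/4π = 1), and compares the naive r × p with the modified L. All function names and the choice of initial data are assumptions made for illustration.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(sum(x * x for x in a))


def deriv(state, m, e, g):
    """Time derivative of (r, v) under m dv/dt = e v x B with B = g r / (4 pi r^3)."""
    r, v = state[:3], state[3:]
    rn = norm(r)
    b = tuple(g * x / (4 * math.pi * rn ** 3) for x in r)
    force = cross(v, b)
    return v + tuple(e * f / m for f in force)


def rk4_step(state, dt, m, e, g):
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))
    k1 = deriv(state, m, e, g)
    k2 = deriv(shifted(k1, dt / 2), m, e, g)
    k3 = deriv(shifted(k2, dt / 2), m, e, g)
    k4 = deriv(shifted(k3, dt), m, e, g)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def naive_l(state, m):
    """The naive angular momentum r x p with p = m v."""
    r, v = state[:3], state[3:]
    return cross(r, tuple(m * x for x in v))


def conserved_l(state, m, e, g):
    """The modified angular momentum r x p - (e g / 4 pi) r_hat."""
    r = state[:3]
    rn = norm(r)
    shift = e * g / (4 * math.pi)
    return tuple(l - shift * x / rn for l, x in zip(naive_l(state, m), r))


def evolve(state, steps=2000, dt=1e-3, m=1.0, e=1.0, g=4 * math.pi):
    for _ in range(steps):
        state = rk4_step(state, dt, m, e, g)
    return state


if __name__ == "__main__":
    start = (1.0, 0.0, 0.0, 0.0, 1.0, 0.3)
    end = evolve(start)
    print("naive    :", naive_l(start, 1.0), "->", naive_l(end, 1.0))
    print("modified :", conserved_l(start, 1.0, 1.0, 4 * math.pi),
          "->", conserved_l(end, 1.0, 1.0, 4 * math.pi))
```

With these initial data the particle never approaches the origin, so a fixed step size is adequate; a trajectory aimed closer to the monopole would need a smaller step.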


Before we move on, there’s a nice and quick corollary that we can draw from this. The
angular momentum vector L does not change with time. But the angle that the particle
makes with this vector is fixed, since

    \mathbf{L} \cdot \hat{\mathbf{r}} = -\frac{eg}{4\pi} = constant

This means that the particle moves on a cone, with axis L and angle
cos θ = −eg/4πL.

Figure 3:

So far, our discussion has been classical. Now we invoke some simple quantum
mechanics: the angular momentum should be quantised. In particular, the angular
momentum in the z-direction should be L_z ∈ ½ħZ. Using the result above, we have

    \frac{eg}{4\pi} = \frac{1}{2} \hbar n    \Rightarrow    eg = 2\pi\hbar n    with n \in \mathbf{Z}


Once again, we find the Dirac quantisation condition. 


On Bosons and Fermions 


There is an interesting factor of 2 buried in the discussion above. Consider a minimal
Dirac monopole, with g = 2πħ/e. In the background of this monopole, we will throw
in a particle of spin S. The total angular momentum J is then

    \mathbf{J} = \mathbf{L} + \mathbf{S} = \mathbf{r} \times \mathbf{p} + \mathbf{S} - \frac{\hbar}{2} \hat{\mathbf{r}}    (1.9)

The key observation is that the final term, due to the monopole, shifts the total angular
momentum by ħ/2. That means, in the presence of a monopole, bosons have half-integer
angular momentum while fermions have integer angular momentum! We’ll not need
this curious fact for most of these lectures, but it will return in Section 8.6 when we 
discuss some surprising dualities in d = 2 + 1 quantum field theories. 


¹ We also noticed this in the lecture notes on Classical Dynamics; see Section 4.3.2.


1.2 The Theta Term 


In relativistic notation, the Maxwell action for electromagnetism takes a wonderfully 
compact form, 


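    S_{Maxwell} = -\frac{1}{4\mu_0} \int d^4x\, F_{\mu\nu} F^{\mu\nu} = \int d^4x \left( \frac{\epsilon_0}{2} \mathbf{E}^2 - \frac{1}{2\mu_0} \mathbf{B}^2 \right)    (1.10)

Here F_{μν} = ∂_μ A_ν − ∂_ν A_μ and F_{0i} = E_i/c and F_{ij} = −ε_{ijk} B_k.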


One reason that the Maxwell action is so simple is that there is very little else we 
can write down that is both gauge invariant and Lorentz invariant. There are terms 
of order ~ F⁴ and higher, which give rise to non-linear electrodynamics, but these will
always be suppressed by some high mass scale and are unimportant at low energies.


There is, however, one other term that we can add to the Maxwell action that, at 
first glance, would seem to be of equal importance. A second glance then shows that 
it is completely unimportant and it’s on the third glance that we see the role it plays. 
This is the theta term. 


We start by defining the dual tensor 

    {}^\star F^{\mu\nu} = \frac{1}{2} \epsilon^{\mu\nu\rho\sigma} F_{\rho\sigma}

⋆F^{μν} takes the same form as the original electromagnetic tensor F_{μν}, but with E/c → B
and B → −E/c. The theta term is then given by

    S_\theta = \frac{\theta e^2}{4\pi^2\hbar} \int d^4x\, \frac{1}{4} {}^\star F^{\mu\nu} F_{\mu\nu} = \frac{\theta e^2}{4\pi^2\hbar c} \int d^4x\, \mathbf{E} \cdot \mathbf{B}    (1.11)

where θ is a parameter. The morass of constants which accompany it ensure, among
other things, that θ is dimensionless; we will have more to say about this in Section
1.2.4. Like the original Maxwell term, the theta term is quadratic in electric and 
magnetic fields. However, it is simple to check that the theta term can be written as a 
total derivative, 


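    S_\theta = \frac{\theta e^2}{8\pi^2\hbar} \int d^4x\, \partial_\mu \left( \epsilon^{\mu\nu\rho\sigma} A_\nu \partial_\rho A_\sigma \right)    (1.12)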


We say that the theta term is topological. It depends only on boundary information. 
Another way of saying this is that we don’t need to use the spacetime metric to define 
the theta term; we instead use the volume form ε^{μνρσ}. The upshot is that the theta
term does not change the equations of motion and, it would seem, can have little effect 
on the physics. 
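
For completeness, here is a sketch of the check that the theta term is a total derivative. It uses only the definition of the dual tensor, F_{μν} = ∂_μ A_ν − ∂_ν A_μ, and the antisymmetry of ε^{μνρσ}; the last step drops a term because ε^{μνρσ} ∂_μ ∂_ρ A_σ vanishes, partial derivatives being symmetric.

```latex
\begin{aligned}
\frac{1}{4}\,{}^\star F^{\mu\nu}F_{\mu\nu}
 &= \frac{1}{8}\,\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}
  = \frac{1}{2}\,\epsilon^{\mu\nu\rho\sigma}\,\partial_\mu A_\nu\,\partial_\rho A_\sigma \\
 &= \frac{1}{2}\,\partial_\mu\!\left(\epsilon^{\mu\nu\rho\sigma}A_\nu\,\partial_\rho A_\sigma\right)
  - \frac{1}{2}\,\epsilon^{\mu\nu\rho\sigma}A_\nu\,\partial_\mu\partial_\rho A_\sigma
  = \frac{1}{2}\,\partial_\mu\!\left(\epsilon^{\mu\nu\rho\sigma}A_\nu\,\partial_\rho A_\sigma\right)
\end{aligned}
```

Multiplying by the prefactor θe²/4π²ħ of (1.11) gives the coefficient θe²/8π²ħ that appears in (1.12).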




As we will now see, this latter conclusion is a little rushed. There are a number of 
situations in which the theta term does lead to interesting physics. These situations 
often involve subtle interplay between quantum mechanics and topology. 

Axion Electrodynamics 


We start by looking at situations where θ affects the dynamics classically. This occurs
when θ is not constant, but instead varies in space and, possibly, time: θ = θ(x,t). In
general, the action governing the electric and magnetic field is given by

    S = \int d^4x \left( -\frac{1}{4\mu_0} F_{\mu\nu} F^{\mu\nu} + \frac{e^2}{16\pi^2\hbar} \theta(\mathbf{x},t)\, {}^\star F^{\mu\nu} F_{\mu\nu} \right)

The equations of motion from this action read

    \nabla \cdot \mathbf{E} = -\frac{\alpha c}{\pi} \nabla\theta \cdot \mathbf{B}    and    -\frac{1}{c^2} \frac{\partial \mathbf{E}}{\partial t} + \nabla \times \mathbf{B} = \frac{\alpha}{\pi c} \left( \dot{\theta} \mathbf{B} + \nabla\theta \times \mathbf{E} \right)    (1.13)

where

    \alpha = \frac{1}{4\pi\epsilon_0} \frac{e^2}{\hbar c}

is the dimensionless fine structure constant. It takes the approximate value α ≈ 1/137.
The deformed Maxwell equations are sometimes referred to as the equations of axion
electrodynamics. The name is slightly misleading; an axion is what you get if you
promote θ to a new dynamical field. Here we’re considering it to be some fixed
background. The equations (1.13) are accompanied by the usual Bianchi identities,
∂_μ ⋆F^{μν} = 0, which remain unchanged

    \nabla \cdot \mathbf{B} = 0    and    \frac{\partial \mathbf{B}}{\partial t} + \nabla \times \mathbf{E} = 0

The equations (1.13) carry much, although not all, of the new physics. The first
tells us that in regions of space where θ varies, a magnetic field B acts like an electric
charge density. The second tells us that the combination (θ̇ B + ∇θ × E) acts like a
current density.


1.2.1 The Topological Insulator 


There is a fascinating class of materials, known as topological insulators, whose
dynamics is characterised by the fact that θ = π. (We’ll see what’s special about the
value θ = π in Section 1.2.4.) Examples include the Bismuth compounds Bi2Se3 and
Bi2Te3.

Figure 4: Applying a magnetic field    Figure 5: Applying an electric field


Consider a topological insulator, with θ = π, filling (most of) the lower-half plane,
z < −ε. We fill (most of) the upper-half plane, z > ε, with the vacuum which has
θ = 0. In the intermediate region z ∈ [−ε, ε] we have ∂_z θ ≠ 0.


Let’s first shine a magnetic field B_z = B on this interface from below, as shown in
the left-hand panel of the figure. The first equation in (1.13) tells us that there is an
effective accumulation of charge density, ρ = −αcε0(∂_z θ)B/π. Since θ drops from π
to 0 as we cross the interface, the surface charge per unit area is given by

    \sigma = \int dz\, \rho = \alpha c \epsilon_0 B

This surface charge will give rise to an electric field outside the topological insulator. 
We learn that the boundary of a topological insulator has rather striking properties: it 
takes a magnetic field inside and generates an electric field outside! 


Alternatively, we can turn on an electric field which lies tangential to the interface, 
say E_x = E. This is shown in the right-hand panel of the figure. The second equation
in (1.13) tells us that, in the regime where ∂_z θ ≠ 0, the electric field acts as a surface
current K, lying within the interface, perpendicular to E,

    K_y = \alpha \epsilon_0 c E_x    (1.14)


This, in turn, then generates a magnetic field outside the topological insulator, per- 
pendicular to both E and K. This, again, is shown in the right-hand panel of the 
figure. 


The creation of a two-dimensional current which lies perpendicular to an applied 
electric field is called the Hall effect. The coefficient of proportionality is known as the 
Hall conductivity and there is a long and beautiful story about how it takes certain 
very special values which are rational multiples of e²/2πħ. (More details can be found
in the lecture notes on the Quantum Hall Effect.) In the present example (1.14), the
Hall conductivity is

    \sigma_{xy} = \frac{1}{2} \frac{e^2}{2\pi\hbar}


This is usually abbreviated to say that the interface of the topological insulator has 
Hall conductivity 1/2. 
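
The factor of 1/2 follows from nothing more than the definition of the fine structure constant, since αε0c = e²/4πħ. The sketch below checks this with SI values of the constants (quoted from CODATA 2018; the function names are ours), taking the coefficient in (1.14) to be αε0c.

```python
import math

# CODATA 2018 values in SI units.
HBAR = 1.054571817e-34      # J s
E_CHARGE = 1.602176634e-19  # C
EPS0 = 8.8541878128e-12     # F / m
C = 299792458.0             # m / s


def fine_structure_constant():
    """alpha = e^2 / (4 pi eps0 hbar c)."""
    return E_CHARGE ** 2 / (4 * math.pi * EPS0 * HBAR * C)


def interface_hall_conductivity():
    """Coefficient K_y / E_x = alpha eps0 c, in siemens."""
    return fine_structure_constant() * EPS0 * C


def hall_conductivity_unit():
    """The natural unit e^2 / (2 pi hbar), in siemens."""
    return E_CHARGE ** 2 / (2 * math.pi * HBAR)


if __name__ == "__main__":
    print("1/alpha =", 1 / fine_structure_constant())
    print("sigma_xy in units of e^2/2 pi hbar =",
          interface_hall_conductivity() / hall_conductivity_unit())
```

The ratio is 1/2 identically in ε0 and c, so the numerical values of those two constants only affect the answer through rounding.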


The general phenomenon in which electric fields induce magnetic fields and vice versa 
goes by the name of the topological magneto-electric effect. 


Continuity Conditions 


There’s a slightly different, but equivalent way of describing the physics above. It 
doesn’t tell us anything new, but it does make contact with the language we previously 
used to describe electrodynamics in materials². We introduce the electric displacement

    \mathbf{D} = \epsilon_0 \left( \mathbf{E} + \frac{\alpha c \theta}{\pi} \mathbf{B} \right)

Comparing to the usual expression for D, we see that, in a topological insulator, a
magnetic field B acts like polarisation. When θ varies, we have a varying polarisation,
resulting in bound charge. This is what we saw in the topological insulator interface
above. Similarly, we define the magnetising field

    \mathbf{H} = \frac{1}{\mu_0} \left( \mathbf{B} - \frac{\alpha \theta}{\pi c} \mathbf{E} \right)

We see that, in a topological insulator, E acts like magnetisation. When θ varies, we get a
varying magnetisation which results in bound currents.


With these definitions, the equations of axion electrodynamics (1.13) take the usual 
form of the Maxwell equations in matter 

    \nabla \cdot \mathbf{D} = 0    and    \nabla \times \mathbf{H} - \frac{\partial \mathbf{D}}{\partial t} = 0

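
As a check, here is a sketch of how the equations in matter reproduce (1.13). It assumes the definitions D = ε0(E + αcθB/π) and H = (B − αθE/πc)/μ0, and uses ε0μ0 = 1/c² together with the Bianchi identities ∇ · B = 0 and ∂B/∂t + ∇ × E = 0.

```latex
\begin{aligned}
\nabla\cdot\mathbf{D}
 &= \epsilon_0\left(\nabla\cdot\mathbf{E}
   + \frac{\alpha c}{\pi}\,\nabla\theta\cdot\mathbf{B}
   + \frac{\alpha c\,\theta}{\pi}\,\nabla\cdot\mathbf{B}\right) = 0
 \;\;\Rightarrow\;\;
 \nabla\cdot\mathbf{E} = -\frac{\alpha c}{\pi}\,\nabla\theta\cdot\mathbf{B} \\
\mu_0\left(\nabla\times\mathbf{H} - \frac{\partial\mathbf{D}}{\partial t}\right)
 &= \nabla\times\mathbf{B} - \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t}
   - \frac{\alpha}{\pi c}\left(\dot\theta\,\mathbf{B} + \nabla\theta\times\mathbf{E}\right)
   - \frac{\alpha\,\theta}{\pi c}\left(\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t}\right) = 0
\end{aligned}
```

The final bracket in the second line vanishes by the Bianchi identity, so terms proportional to θ itself drop out and only gradients and time derivatives of θ survive.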
Now we can use the standard arguments (involving Gaussian pillboxes and line
integrals) that tell us B perpendicular to a surface and E tangential to a surface are
necessarily continuous. This means that if we introduce the normal vector to the
surface n, then

    \mathbf{n} \cdot \Delta\mathbf{B} = 0    and    \mathbf{n} \times \Delta\mathbf{E} = 0    (1.15)

For a usual dielectric, D perpendicular to a surface and H parallel to a surface are
both discontinuous, with the discontinuity given by the surface charge and current
respectively. Here, we’ve absorbed the θ-induced surface charges and currents into the
definition of D and H. If there are no further, external charges we have

    \mathbf{n} \cdot \Delta\mathbf{D} = 0    and    \mathbf{n} \times \Delta\mathbf{H} = 0    (1.16)

It is simple to check that this condition reproduces the topological magneto-electric
results that we described above.

² See Section 7 of the lectures on Electromagnetism.


1.2.2 A Mirage Monopole 


Let’s continue to explore the physics of the interface between the vacuum (filling z > 0) and
a topological insulator (filling z < 0). Here’s a fun game to play: take an electric charge 
q and place it in the vacuum at point x = (0,0,d), a distance d above a topological 
insulator. What do the resulting electric and magnetic fields look like? 


We can answer this using the continuity conditions described above, together with 
the idea of an image charge. (We met the image charge in the Electromagnetism lecture 
notes when discussing metals. One can also use the same tricks to describe the electric 
field in the presence of a dielectric, which is closer in spirit to the calculation here.) As 
always with the method of images, we need a flash of insight to write down an ansatz. 
(Or, equivalently, someone to tell us the answer). However, if we find a solution that 
works then general results about the uniqueness of boundary-value problems ensure 
that this is the unique solution.


In the present case, the answer is quite cute: we will see that if we sit in the vacuum 
z > 0, the electric and magnetic field lines are those due to the original particle at 
x = (0,0,d), together with a mirror dyon sitting at x = (0,0,—d) with electric and 
magnetic charges (q', g). Meanwhile, if we sit in the topological insulator, z < 0, the
electric and magnetic field lines are those due to the original particle, now superposed
with those arising from a mirror dyon with charges (q', −g), also sitting at x = (0,0,d).
Note that in both cases, the dyon is a mirage: it sits outside of the region we have
access to. If we try to reach it by crossing the boundary, it switches to the other side!

Figure 6: i) An electric charge q placed near a topological insulator. ii) The resulting electric
and magnetic field lines as seen outside. iii) The field lines as seen inside.


To see that this is the correct answer (and to compute q' and g), we work with scalar
potentials. It’s familiar to use the electrostatic equation ∇ × E = 0 to write E = −∇φ.
The electric potential in the two regions is

    \phi = \frac{1}{4\pi\epsilon_0} \left( \frac{q}{\sqrt{x^2 + y^2 + (z-d)^2}} + \frac{q'}{\sqrt{x^2 + y^2 + (z+d)^2}} \right)    z > 0

and

    \phi = \frac{1}{4\pi\epsilon_0} \frac{q + q'}{\sqrt{x^2 + y^2 + (z-d)^2}}    z < 0

Note that E_x = −∂_x φ and E_y = −∂_y φ are both continuous at the interface z = 0, as
required by (1.15). In contrast, E_z will be discontinuous; we’ll look at this shortly.


For the magnetic field, this is one of the few occasions where it’s useful to work with
the magnetic scalar potential. This means that we use the fact that ∇ × B = 0 to
write B = −∇Ω. (Recall the warning from earlier lectures: unlike the electric scalar φ,
there is nothing fundamental about Ω; it is merely a useful computational trick). We
then have

    \Omega = \frac{1}{4\pi} \frac{g}{\sqrt{x^2 + y^2 + (z+d)^2}}    z > 0

and

    \Omega = -\frac{1}{4\pi} \frac{g}{\sqrt{x^2 + y^2 + (z-d)^2}}    z < 0

Note that B_z = −∂_z Ω is continuous across the plane z = 0, as required by the condition
(1.15).

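Let’s now look at the (dis)continuity conditions (1.16). From the expressions above,
we have

    D_z|_{z=0^+} = \epsilon_0 E_z|_{z=0^+} = -\frac{1}{4\pi} \frac{(q - q')\, d}{(x^2 + y^2 + d^2)^{3/2}}

and

    D_z|_{z=0^-} = \epsilon_0 \left( E_z + \alpha c B_z \right)|_{z=0^-} = -\frac{1}{4\pi} \frac{(q + q' - \alpha c \epsilon_0 g)\, d}{(x^2 + y^2 + d^2)^{3/2}}

Equating these tells us that the magnetic charge on the image dyon is

    g = \frac{2q'}{\alpha c \epsilon_0}    (1.17)

Similarly, the magnetic field tangent to the interface is

    H_x|_{z=0^+} = \frac{1}{\mu_0} B_x|_{z=0^+} = \frac{1}{4\pi\mu_0} \frac{g\, x}{(x^2 + y^2 + d^2)^{3/2}}

and

    H_x|_{z=0^-} = \frac{1}{\mu_0} \left( B_x - \frac{\alpha}{c} E_x \right)|_{z=0^-} = -\frac{1}{4\pi\mu_0} \left( g + \frac{\alpha (q + q')}{c \epsilon_0} \right) \frac{x}{(x^2 + y^2 + d^2)^{3/2}}

which gives us

    g = -\frac{(q + q') \alpha}{2 c \epsilon_0}    (1.18)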


Happily, we have found a solution for which both (1.15) and (1.16) can be satisfied across
the boundary. Uniqueness means that this must be the correct solution. As we have seen,
it involves mirage dyons sitting beyond our reach. From (1.17) and (1.18), we learn that
the electric and magnetic charges carried by these dyons are given by

    q' = -\frac{\alpha^2}{4 + \alpha^2} q    and    g = -\frac{2\alpha}{(4 + \alpha^2) c \epsilon_0} q

The monopoles and dyons that arise in this way are mirages. Experimentally, we’re
in the slightly unusual situation where we can see mirage monopoles, but not real
monopoles!
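
The algebra that combines the two matching conditions is short, but it is easy to slip on a sign. The sketch below assumes the conditions take the form g = 2q'/(αcε0) for (1.17) and g = −α(q + q')/(2cε0) for (1.18), and checks that the quoted mirage charges satisfy both. The function names, and the choice to pass the product cε0 as a single parameter, are ours.

```python
def mirage_charges(q, alpha, c_eps0):
    """Image charges (q', g) for a charge q above a theta = pi insulator."""
    q_prime = -alpha ** 2 * q / (4 + alpha ** 2)
    g = -2 * alpha * q / ((4 + alpha ** 2) * c_eps0)
    return q_prime, g


def residuals(q, alpha, c_eps0):
    """Left minus right-hand sides of the two matching conditions."""
    q_prime, g = mirage_charges(q, alpha, c_eps0)
    normal_d = g - 2 * q_prime / (alpha * c_eps0)                # from n . Delta D = 0
    tangential_h = g + alpha * (q + q_prime) / (2 * c_eps0)      # from n x Delta H = 0
    return normal_d, tangential_h


if __name__ == "__main__":
    alpha = 1 / 137.036
    print(mirage_charges(1.0, alpha, 1.0))
    print(residuals(1.0, alpha, 1.0))
```

With α ≈ 1/137 the image electric charge is tiny, of order α²q/4, which is one reason the effect is delicate to observe.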


1.2.3 The Witten Effect 


There is also an interesting story to tell about genuine magnetic monopoles. As we 
now show, the effect of the 0 term is to endow the magnetic monopole with an electric 
charge. This is known as the Witten effect. 
It’s simplest to frame the set-up by first taking a magnetic monopole with magnetic
charge g and placing it inside a vacuum, with θ = 0. We then surround this with a
medium that has θ ≠ 0 as shown in the figure. We know what happens from our
discussion above. When the magnetic field crosses the interface where θ changes, it
will induce an electric charge. This charge follows from the first equation in (1.13).
From inside the medium where θ ≠ 0, it looks as if the monopole has electric charge

    q = -\frac{\alpha c \epsilon_0 \theta}{\pi} g    (1.19)

Figure 7:

Note, however, that this result is independent of the size of the interior region where
θ = 0. We could shrink this region down until it is infinitesimally small, and we still
find that the monopole has electric charge q. The correct interpretation of this is that when
θ ≠ 0, a monopole is, in fact, a dyon: it carries electric charge (1.19).


When the monopole carries the minimum allowed magnetic charge, its electric charge
is given by

$$g = \frac{2\pi\hbar}{e} \quad\Rightarrow\quad q = -\frac{e\theta}{2\pi}$$

In particular, if we place a magnetic monopole inside a topological insulator, it turns
into a dyon which carries half the charge of the electron.
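
The arithmetic behind this statement is short enough to spell out. Taking (1.19) together with the definition of the fine structure constant, $\alpha = e^2/4\pi\epsilon_0\hbar c$, a sketch of the steps is:

```latex
\begin{align}
q = -\frac{\alpha c \epsilon_0 \theta}{\pi}\, g
  = -\frac{e^2}{4\pi\epsilon_0\hbar c}\,\frac{c\epsilon_0\theta}{\pi}\,\frac{2\pi\hbar}{e}
  = -\frac{e\theta}{2\pi}
\end{align}
```

Setting $\theta = \pi$ then gives a charge of magnitude $e/2$.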


Note that if we take $\theta = 2\pi$ then the electric charge of the monopole coincides with
that of the electron; in this case, we can construct a neutral monopole by considering
a bound state of the dyon + positron. However, when $\theta$ is not a multiple of $2\pi$, all
monopoles necessarily carry electric charge.


One might wonder why we had to introduce the region with $\theta = 0$ at all. What
happens if we simply insist that we place the monopole directly in a system with
$\theta \neq 0$? You would again discover the Witten effect, but now you have to be careful
about the boundary conditions you can place on the gauge field at the origin. We won’t
describe this here. We will, however, give a slightly different derivation. Consider, once
again, first placing a monopole in a medium with $\theta = 0$. This time we very slowly
increase $\theta$. (Don’t ask me how...I don’t know! We just imagine it’s possible.) The
second equation in (1.13) contains a $\dot\theta$ term which tells us that this will be accompanied
by a time-varying electric field which lies parallel to ${\bf B}$. At the end of this process, the
final electric field will be

$${\bf E} = -\frac{\alpha c\theta}{\pi}\,{\bf B}$$

Once again, we learn that the monopole carries an electric charge given by (1.19).


We’ll see various other manifestations of the Witten effect as these lectures progress 
including, in Section 2.8.2, for monopoles in non-Abelian gauge theories. 


1.2.4 Why $\theta$ is Periodic


In classical axion electrodynamics, $\theta$ can take any value. Indeed, as we have seen, it is
only spatial and temporal variations of $\theta$ that play a role. However, in the quantum
theory $\theta$ is a periodic variable: it lies in the range

$$\theta \in [0, 2\pi)$$

This is the real reason why $\theta$ was accompanied by that mess of other constants
multiplying the action; it is to ensure that the periodicity is something natural.


The periodicity of $\theta$ in electrodynamics is actually fairly subtle. It hinges on the
topology of the $U(1)$ gauge fields. We’ll see that, after imposing appropriate boundary
conditions, $S_\theta$ can only take values of the form

$$S_\theta = \hbar\theta N \quad {\rm with} \quad N\in{\bf Z} \qquad (1.20)$$

This means that the theta angle contributes to the partition function as

$$\exp\left(\frac{iS_\theta}{\hbar}\right) = e^{iN\theta}$$

The factor of $i$ here is all important. In Minkowski signature, the action always sits
with a factor of $i$. However, one of the special things about the theta term is that it
has only a single time derivative in the integrand, a fact which can be traced to the
appearance of the $\epsilon^{\mu\nu\rho\sigma}$ anti-symmetric tensor. This means that the factor of $i$ persists
even in Euclidean signature. Since $N$ is an integer, we see that the value of $\theta$ in the
partition function is only important modulo $2\pi$.


So our task is to show that, when evaluated on any field configuration, $S_\theta$ must take
the form (1.20). The essence of the argument follows from the fact that the theta term
is a total derivative (1.12), which shows us that the value of $S_\theta$ depends only on the
boundary condition. To exploit the topology lurking in the $U(1)$ gauge field, we will
work on a compact Euclidean spacetime which we take to be ${\bf T}^4$. We’ll take each of
the circles in the torus to have radius $R$.

We’ll make life even easier for ourselves by restricting to the special case ${\bf E} = (0,0,E)$
and ${\bf B} = (0,0,B)$ with $E$ and $B$ constant. The integral that we’re interested in is

$$\int_{{\bf T}^4} d^4x\ {\bf E}\cdot{\bf B} = \int_{{\bf T}^2} dx^0dx^3\ E \int_{{\bf T}^2} dx^1dx^2\ B \qquad (1.21)$$


This still looks like it can take any value we like. But we need to recall that ${\bf E}$ and
${\bf B}$ are not the fundamental fields; these are the gauge fields $A_\mu$. And these must be
well defined on the underlying torus. As we’ll now show, this puts restrictions on the
allowed values of $E$ and $B$.


First, we need the following result: when a direction of space, say $x^1$, is periodic with
radius $R$, then the constant part of the corresponding gauge field (also known as the
zero mode) also becomes periodic, with

$$A_1 \equiv A_1 + \frac{\hbar}{eR} \qquad (1.22)$$


This arises because the presence of a circle allows us to do something interesting with
gauge transformations $A_\mu \to A_\mu + \partial_\mu\omega$. As in Section 1.1.2, we do not insist that $\omega(x)$
is single valued. Instead, we require only that $e^{ie\omega/\hbar}$ is single-valued, since this is what
acts on the wavefunction. This allows us to perform gauge transformations that wind
around the circle, such as

$$\omega = \frac{x^1\hbar}{eR}$$

These are sometimes called large gauge transformations, a name which reflects the
fact that they cannot be continuously deformed to the identity. Under such a gauge
transformation, we see that

$$A_1 \to A_1 + \partial_1\omega = A_1 + \frac{\hbar}{eR}$$

But field configurations that are related by a gauge transformation are to be viewed as
physically equivalent. We learn that the constant part of the gauge field is periodically
identified as (1.22) as claimed.


Now let’s see how this fact restricts the allowed values of the integral (1.21). The
magnetic field is written as

$$B = \partial_1 A_2 - \partial_2 A_1$$

We can work in a gauge where $A_1 = 0$, so that $B = \partial_1 A_2$. If we want a uniform,
constant $B$ then we need to write $A_2 = Bx^1$. This isn’t single valued. However, that
needn’t be a problem because, as we’ve seen above, $A_2$ is actually a periodic variable
with periodicity $\hbar/eR$. This means that we’re perfectly at liberty to write $A_2 = Bx^1$,
but only if this has the correct period (1.22). This holds provided

$$B = \frac{\hbar n}{2\pi eR^2} \quad {\rm with} \quad n\in{\bf Z} \quad\Rightarrow\quad \int_{{\bf T}^2} dx^1dx^2\ B = \frac{2\pi\hbar n}{e} \qquad (1.23)$$

Note that this is the same as the condition on $\int d{\bf S}\cdot{\bf B} = g$ that we derived from the
Dirac quantisation condition (1.3). Indeed, the derivation above relies on the same
kind of arguments that we used when discussing magnetic monopoles.


We can now apply exactly the same argument to the electric field,

$$\frac{E}{c} = \partial_0 A_3 - \partial_3 A_0$$

Let’s work in a gauge with $A_0 = 0$, so that $E/c = \partial_0 A_3$. We can write $A_3 = (E/c)x^0$,
which is compatible with the periodicity of $A_3$ only when $E/c = \hbar n'/2\pi eR^2$ for some
$n' \in {\bf Z}$. We find

$$\int_{{\bf T}^2} dx^0dx^3\ E = \frac{2\pi\hbar c\,n'}{e} \qquad (1.24)$$

Before we go on, let me point out something that may be confusing. You may have
thought that the relevant equation for ${\bf E}$ is Gauss’ law which, given the quantisation of
charge, states that $\int d{\bf S}\cdot{\bf E} = en'$ for some $n' \in {\bf Z}$. But that’s not what we computed
in (1.24) because ${\bf E} = (0,0,E)$ lies parallel to the side of the torus, not perpendicular.
Instead, both (1.23) and (1.24) are best thought of as integrating the 2-form $F$ over
the appropriate ${\bf T}^2$. For the magnetic field, this coincides with $\int d{\bf S}\cdot{\bf B}$ which measures
the magnetic charge enclosed in the manifold. It does not, however, coincide with
$\int d{\bf S}\cdot{\bf E}$ which measures the electric charge.


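Armed with (1.23) and (1.24), we see that, at least for this specific example,

$$\int_{{\bf T}^4} d^4x\ {\bf E}\cdot{\bf B} = \frac{4\pi^2\hbar^2 c N}{e^2} \quad\Rightarrow\quad S_\theta = \hbar\theta N \quad {\rm with} \quad N = nn' \in {\bf Z}$$

which is our promised result (1.20).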
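
The bookkeeping in this argument can be checked numerically. The following is a minimal sketch rather than anything definitive: it sets $\hbar = c = e = 1$ for convenience (an assumption), uses the hypothetical helper `theta_action`, and takes the normalisation $S_\theta = \frac{\theta e^2}{4\pi^2\hbar c}\int d^4x\ {\bf E}\cdot{\bf B}$, which is the one implied by the last line above. It confirms that the result does not depend on $R$ and that $e^{iS_\theta/\hbar}$ is unchanged under $\theta \to \theta + 2\pi$.

```python
import cmath
import math

HBAR = C = E_CHARGE = 1.0  # units chosen for convenience (assumption)


def theta_action(theta, n, n_prime, R):
    """S_theta on T^4 for constant fields obeying (1.23) and (1.24)."""
    area = (2 * math.pi * R) ** 2                              # area of each T^2
    B = HBAR * n / (2 * math.pi * E_CHARGE * R**2)             # from (1.23)
    E = C * HBAR * n_prime / (2 * math.pi * E_CHARGE * R**2)   # from (1.24)
    integral = (E * area) * (B * area)                         # factorised as in (1.21)
    prefactor = theta * E_CHARGE**2 / (4 * math.pi**2 * HBAR * C)
    return prefactor * integral


theta, n, n_prime = 0.7, 3, -2
S_small = theta_action(theta, n, n_prime, R=0.5)
S_large = theta_action(theta, n, n_prime, R=40.0)

# The result is hbar * theta * N with N = n n', whatever the size of the torus.
assert math.isclose(S_small, HBAR * theta * n * n_prime)
assert math.isclose(S_large, HBAR * theta * n * n_prime)

# The weight exp(i S / hbar) only depends on theta modulo 2 pi.
w1 = cmath.exp(1j * S_small / HBAR)
w2 = cmath.exp(1j * theta_action(theta + 2 * math.pi, n, n_prime, R=0.5) / HBAR)
assert cmath.isclose(w1, w2, abs_tol=1e-9)
```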


The above explanation was rather laboured. It’s pretty straightforward to generalise
it to non-constant ${\bf E}$ and ${\bf B}$ fields. If you’re mathematically inclined, it is the statement
that the second Chern number of a $U(1)$ bundle is integer valued and, as we have seen
above, is actually equal to the product of two first Chern numbers. Finally note that,
although we took Euclidean spacetime to be a torus ${\bf T}^4$, the end result does not depend
on the volume of the torus which is set by $R$. Nonetheless, the introduction of the
torus was crucial in our argument: we needed the circles of ${\bf T}^4$ to exploit the fact that
$\Pi_1(U(1)) = {\bf Z}$. We will see another derivation of this when we come to discuss the
anomaly in Section 3.3.1.


1.2.5 Parity, Time-Reversal and $\theta = \pi$


The theta term does not preserve the same symmetries as the Maxwell term. It is, 
of course, gauge invariant and Lorentz invariant. But it is not invariant under certain 
discrete symmetries. 


The discrete symmetries of interest are parity $\mathcal{P}$ and time reversal invariance $\mathcal{T}$.
Parity acts by flipping all directions of space

$$\mathcal{P}:\ {\bf x} \mapsto -{\bf x} \qquad (1.25)$$

(At least this is true in any odd number of spatial dimensions; in an even number of
spatial dimensions, this is simply a rotation.) Meanwhile, as the name suggests, time
reversal flips the direction of time

$$\mathcal{T}:\ t \mapsto -t$$

We would like to understand how these act on the electric and magnetic fields. This
follows from looking at the Lorentz force law,

$$m\ddot{\bf x} = e\left({\bf E} + \dot{\bf x}\times{\bf B}\right)$$

This equation is invariant under neither parity nor time reversal. However, it can be
made invariant if we simultaneously act on both ${\bf E}$ and ${\bf B}$ as

$$\mathcal{P}:\ {\bf E}({\bf x},t) \mapsto -{\bf E}(-{\bf x},t) \quad {\rm and} \quad \mathcal{P}:\ {\bf B}({\bf x},t) \mapsto {\bf B}(-{\bf x},t)$$

and

$$\mathcal{T}:\ {\bf E}({\bf x},t) \mapsto {\bf E}({\bf x},-t) \quad {\rm and} \quad \mathcal{T}:\ {\bf B}({\bf x},t) \mapsto -{\bf B}({\bf x},-t)$$

We say that ${\bf E}$ is odd under parity and even under time reversal; ${\bf B}$ is even under parity
and odd under time reversal.
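
These transformation rules are easy to verify directly. The sketch below, with hypothetical function names and arbitrary sample values, evaluates the acceleration from the Lorentz force law at a single instant and checks that it transforms as expected: under parity ${\bf x}$, $\dot{\bf x}$ and $\ddot{\bf x}$ all flip sign, while under time reversal only $\dot{\bf x}$ does.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def neg(a):
    """Flip the sign of every component of a 3-vector."""
    return tuple(-x for x in a)


def acceleration(v, E, B, e=1.0, m=1.0):
    """Acceleration from the Lorentz force law, m x'' = e (E + x' x B)."""
    vxB = cross(v, B)
    return tuple(e * (E[i] + vxB[i]) / m for i in range(3))


# Arbitrary sample values (not taken from the text).
v, E, B = (1.0, -2.0, 0.5), (0.25, 0.0, -1.0), (2.0, 1.0, 4.0)
a = acceleration(v, E, B)

# Parity: v -> -v, E -> -E, B -> B, and the acceleration must flip sign.
assert acceleration(neg(v), neg(E), B) == neg(a)

# Time reversal: v -> -v, E -> E, B -> -B, and the acceleration is unchanged.
assert acceleration(neg(v), E, neg(B)) == a
```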


As an aside, note that a high energy theorist usually refers to $\mathcal{CP}$ rather than $\mathcal{T}$.
Here $\mathcal{C}$ is charge conjugation which acts as $\mathcal{C}: {\bf E} \mapsto -{\bf E}$ and $\mathcal{C}: {\bf B} \mapsto -{\bf B}$, with the
consequence that $\mathcal{CP}: {\bf E} \mapsto {\bf E}$ and $\mathcal{CP}: {\bf B} \mapsto -{\bf B}$, rather like $\mathcal{T}$. However, there is a
difference between the two symmetries: $\mathcal{CP}$ is unitary, while $\mathcal{T}$ is anti-unitary.

The theta term is proportional to ${\bf E}\cdot{\bf B}$, which is odd under both $\mathcal{P}$ and $\mathcal{T}$.
This means that, in general, the theta term breaks both parity and time-reversal
invariance. We say that $\theta \mapsto -\theta$ under $\mathcal{P}$ and $\mathcal{T}$. There are two exceptions. One of
these is obvious: when $\theta = 0$, the theory is invariant under these discrete symmetries.
However, when $\theta = \pi$ the theory is also invariant. This is because, as we have seen, $\theta$
is periodic so $\theta = \pi$ is the same as $\theta = -\pi$.


This observation also gives some hint as to why the topological insulator has $\theta = \pi$.
These are materials which are defined to be time-reversal invariant. As we have seen,
there are two possibilities for the dynamics of such materials. (In fancy language, they
are said to have a ${\bf Z}_2$ classification.) Most materials are boring and have $\theta = 0$. But
some materials have a band structure which is twisted in a particular way. This results
in $\theta = \pi$.


1.3 Further Reading 


Anyone who has spent even the briefest time looking into the history of physics will have 
learned one thing: it’s complicated. It’s vastly more complicated than the air-brushed 
version we’re fed as students. A fairly decent summary is: everyone was confused. 
Breakthroughs are made by accident, or for the wrong reason, or lie dormant until long 
after they are rediscovered by someone else. Mis-steps later turn out to be brilliant 
moves. Ideas held sacred by one generation are viewed as distractions by the next. 


Things become even harder when attempting to assign attribution. The scientific 
literature alone does not tell the full story. It misses the conference coffee conversations, 
the petty rivalries, the manoeuvering for glory. It misses the fact that, for most of the 
time, everyone was confused. Gell-Mann, who perhaps did more than anyone to lay 
the groundwork for particle physics and quantum field theory, captures this in an 
uncharacteristically rambling manner [47] 


My whole life was like that and I think many people’s lives are like that. If 
you generalised my errors and the things that I got wrong and the things 
that I didn’t follow up properly and the things that I saw and I did not 
believe in and the things I saw and did not write up. Almost always there 
was some error of level. That is I knew that a certain thing was right and 
felt that it was right and it was contradictory to what I was doing and I 
could not get used to the idea that some things you have to answer late: 
you just put them off, but you answer some of the things now and... 


These lectures are concerned with the theoretical structure of gauge theories. It is 
a subject whose history is inextricably bound with experimental discoveries in particle 
physics and the development of the Standard Model. Our understanding of gauge
theory took place slowly, over many decades, and involved many hundreds, if not 
thousands, of physicists. 


Each chapter of these lectures ends with a section in which I offer a broad brush 
account of this history. It is flawed. In places, given the choice between accuracy or 
a good story, I have erred towards a good story. I have, however, included references 
to the original literature. More usefully for students, I have also included references to 
reviews where a number of topics are treated in much greater detail. 


Gauge Symmetry 


These lectures are about gauge symmetry. Although the use of a gauge choice was 
commonplace among classical physicists, it was viewed as a trick for finding solutions 
to the equations of electromagnetism. It took a surprisingly long time for physicists to 
appreciate the idea of gauge invariance as an important principle in its own right.
Fock was the first to realise, in 1926, that the action of gauge symmetry is intricately 
tied to the phase of the wavefunction in quantum mechanics [62]. The credit for viewing 
gauge symmetry (or “eichinvarianz”) as a desirable property of our theories of Nature 
is usually attributed to Weyl [203] although, as with many stories in the history of 
physics, his motivation now seems somewhat misplaced as he tried to prematurely 
develop a unified theory of gravity and electromagnetism [204]. (His approach survives 
in the Weyl invariance enjoyed by the worldsheet in string theory.) More historical 
background on the long road to the gauge principle can be found in [116, 150].


Monopoles 


Debrett’s style guide for physics papers includes the golden rule: one idea per paper. 
Many authors flout this, but few flout it in as spectacular a fashion as Dirac. His 1931
paper “Quantised Singularities in the Electromagnetic Field” [44] is primarily about the 
possibility of magnetic monopoles obeying the quantisation condition that now bears 
his name. But the paper starts by reflecting on the negative energy states predicted by 
the Dirac equation which, he is convinced, cannot be protons as he originally suggested. 
Instead, he argues, the negative energy states must correspond to novel particles, equal 
in mass to the electron but with positive charge. 


It seems that Dirac held anti-matter and magnetic monopoles, both predictions made 
within a few pages of each other, on similar footing [55]. He returned to the subject 
of monopoles only once, in 1948, elaborating on the concept of the “Dirac string” [45]. 
But the spectacular experimental discovery of anti-matter in 1932, followed by a long, 
fruitless wait in the search for monopoles, left Dirac disillusioned. Fifty years after his
first paper, and seemingly unconvinced by theoretical arguments (like “monopoles are 
heavy”), he wrote in a letter to Salam [46], 


“I am inclined now to believe that monopoles do not exist. So many years
have gone by without any encouragement from the experimental side.” 


In the intervening years, the experimental situation has not improved. But monopoles 
now sit at the heart of our understanding of quantum field theory. This story took some 
decades to unfold and only really came to fruition with the discovery, by ’t Hooft [99] 
and Polyakov [158], of solitons carrying magnetic charge in non-Abelian gauge theories; 
this will be described in Section 2.8. 


As we saw in these lectures, the angular momentum of a particle-monopole pair has an 
extra anomalous term. This fact was noted long ago by Poincaré [155] in the charmingly 
titled short story “Remarques sur une expérience de M. Birkeland”. In 1936, the Indian 
physicist Meghnad Saha showed that the quantum version of this observation provides 
a re-derivation of the Dirac quantisation [171]. The paper is an ambitious, but flawed, 
attempt to explain the mass of the neutron in terms of a monopole-anti-monopole bound 
state, and the argument for which it is now remembered is dealt with in a couple of 
brief sentences. The angular momentum derivation was later rediscovered by H. Wilson 
[213], prompting an “I did it first” response from Saha [172]. The implication for the 
spin-statistics of monopoles was pointed out in [92] and [112]; a more modern take can 
be found in [137]. 


The extension of the Dirac quantisation condition to dyons was made by Zwanziger 
in 1968 [234], while the idea of patching gauge fields is due to Wu and Yang [231].


There are many good reviews on magnetic monopoles. More details on the material 
discussed in this section can be found in the review by Preskill [164] or the book by Shnir 
[183]. More references are given in the next section when we discuss ’t Hooft-Polyakov 
monopoles. 


Topological Insulators 


The story of topological insulators started in the study of band structures, and the ways 
in which they can twist. The first examples are the TKNN invariant for the integer 
quantum Hall effect [194], and the work of Haldane on Chern insulators [88]. Both of 
these were described in the lectures on the quantum Hall effect [193]. For this work, 
Thouless and Haldane were awarded the 2016 Nobel prize. 

The possibility that a topologically twisted band structure could exist in 3d materials
was realised only in 2006. In July of that year, three groups posted papers on the arXiv 
[139, 170, 66], and in November of that year, Fu and Kane predicted the existence of 
this phase in a number of real materials [67]. This was quickly confirmed in experiments 
[109]. 


The effective field theory of a topological insulator, in terms of electrodynamics with 
$\theta = \pi$, was introduced by Qi, Hughes and Zhang in [165]. This took the subject away
from its lattice underpinnings, and into the realm of quantum field theory. Indeed, 
Wilczek had already discussed a number of properties of electrodynamics in the presence 
of a theta angle [212], including the Witten effect [216]. The existence of the mirror 
monopole was shown in [166], and a number of further related effects were discussed in 
[54]. 


More details of topological insulators can be found in the reviews [91, 167, 17]. 

2. Yang-Mills Theory


Pure electromagnetism is a free theory of a massless spin 1 field. We can ask: is it 
possible to construct an interacting theory of spin 1 fields? The answer is yes, and the 
resulting theory is known as Yang-Mills. The purpose of this section is to introduce 
this theory and some of its properties. 


As we will see, Yang-Mills is an astonishingly rich and subtle theory. It is built upon 
the mathematical structure of Lie groups. These Lie groups have interesting topology 
which ensures that, even at the classical (or, perhaps more honestly, semi-classical) 
level, Yang-Mills exhibits an unusual intricacy. We will describe these features in 
Sections 2.2 and 2.3 where we introduce the theta angle and instantons. 


However, the fun really gets going when we fully embrace $\hbar$ and appreciate that
Yang-Mills is a strongly coupled quantum field theory, whose low-energy dynamics 
looks nothing at all like the classical theory. Our understanding of quantum Yang- 
Mills is far from complete, but we will describe some of the key ideas from Section 2.4 
onwards. 


A common theme in physics is that Nature enjoys the rich and subtle: the most 
beautiful theories tend to be the most relevant. Yang-Mills is no exception. It is the 
theory that underlies the Standard Model of particle physics, describing both the weak 
and the strong forces. Much of our focus, and much of the terminology, in this section 
has its roots in QCD, the theory of the strong force. 


For most of this section we will be content to study pure Yang-Mills, without any 
additional matter. Only in Sections 2.7 and 2.8 will we start to explore how coupling 
matter fields to the theory changes its dynamics. We’ll then continue our study of
Yang-Mills coupled to matter in Section 3 where we discuss anomalies, and in Section
5 where we discuss chiral symmetry breaking. 


2.1 Introducing Yang-Mills 


Yang-Mills theory rests on the idea of a Lie group. The basics of Lie groups and Lie
algebras were covered in the Part 3 lectures on Symmetries and Particle Physics. We
start by introducing our conventions. A compact Lie group $G$ has an underlying Lie
algebra $\mathfrak{g}$, whose generators $T^a$ satisfy

$$[T^a, T^b] = if^{abc}\,T^c \qquad (2.1)$$

Here $a,b,c = 1,\ldots,\dim G$ and $f^{abc}$ are the fully anti-symmetric structure constants.
The factor of $i$ on the right-hand side is taken to ensure that the generators are
Hermitian: $(T^a)^\dagger = T^a$.


Much of our discussion will hold for a general compact, simple Lie group $G$. Recall that
there is a finite classification of these objects. The possible options for the group $G$,
together with the dimension of $G$ and the dimension of the fundamental (or minimal)
representation $F$, are given by


    G        | dim G                 | dim F
    ---------+-----------------------+------
    $SU(N)$  | $N^2-1$               | $N$
    $SO(N)$  | $\frac{1}{2}N(N-1)$   | $N$
    $Sp(N)$  | $N(2N+1)$             | $2N$
    $E_6$    | 78                    | 27
    $E_7$    | 133                   | 56
    $E_8$    | 248                   | 248
    $F_4$    | 52                    | 26
    $G_2$    | 14                    | 7


where we’re using the convention $Sp(1) = SU(2)$. (Other authors sometimes write
$Sp(2n)$, or even $USp(2n)$ to refer to what we’ve called $Sp(N)$, preferring the argument
to refer to the dimension of $F$ rather than the rank of the Lie algebra $\mathfrak{g}$.)
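
The classical entries of this table follow from simple formulae, which the short sketch below tabulates. The function names are hypothetical and chosen for illustration; the formulae are those listed above, with $Sp(N)$ in the convention $Sp(1) = SU(2)$.

```python
def dim_group(family, N):
    """Dimension of the classical Lie groups SU(N), SO(N) and Sp(N)."""
    if family == "SU":
        return N * N - 1
    if family == "SO":
        return N * (N - 1) // 2
    if family == "Sp":
        return N * (2 * N + 1)
    raise ValueError(f"unknown family: {family}")


def dim_fundamental(family, N):
    """Dimension of the fundamental (minimal) representation F, as tabulated."""
    return 2 * N if family == "Sp" else N


# With the convention Sp(1) = SU(2), the dimensions must agree.
assert dim_group("Sp", 1) == dim_group("SU", 2) == 3
assert dim_fundamental("Sp", 1) == dim_fundamental("SU", 2) == 2

# SU(3) has eight generators.
assert dim_group("SU", 3) == 8
```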


Although we will present results for general G, when we want to specialise, or give 
examples, we will frequently turn to G = SU(N). We will also consider G = U(1), in 
which case Yang-Mills theory reduces to Maxwell theory. 


We will need to normalise our Lie algebra generators. We require that the generators
in the fundamental (i.e. minimal) representation $F$ satisfy

$${\rm tr}\,T^aT^b = \frac{1}{2}\delta^{ab} \qquad (2.2)$$

In what follows, we use $T^a$ to refer to the fundamental representation, and will
refer to generators in other representations $R$ as $T^a(R)$. Note that, having fixed the
normalisation (2.2) in the fundamental representation, other $T^a(R)$ will have different
normalisations. We will discuss this in more detail in Section 2.5 where we’ll extract
some physics from the relevant group theory.
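
For $G = SU(2)$ the conventions (2.1) and (2.2) can be checked explicitly. The sketch below assumes the standard choice $T^a = \sigma^a/2$, with $\sigma^a$ the Pauli matrices and $f^{abc} = \epsilon^{abc}$; this choice is not spelled out in the text above, and the helper names are hypothetical.

```python
# Pauli matrices as nested lists; the generators are T^a = sigma^a / 2.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[entry / 2 for entry in row] for row in s] for s in SIGMA]


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(A):
    return A[0][0] + A[1][1]


def epsilon(a, b, c):
    """Totally anti-symmetric symbol on indices 0, 1, 2 with epsilon(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) // 2


for a in range(3):
    for b in range(3):
        # Normalisation (2.2): tr T^a T^b = delta^{ab} / 2.
        assert trace(matmul(T[a], T[b])) == (0.5 if a == b else 0)

        # Algebra (2.1): [T^a, T^b] = i epsilon^{abc} T^c.
        AB, BA = matmul(T[a], T[b]), matmul(T[b], T[a])
        for i in range(2):
            for j in range(2):
                rhs = sum(1j * epsilon(a, b, c) * T[c][i][j] for c in range(3))
                assert AB[i][j] - BA[i][j] == rhs
```

All entries here are exact binary fractions, so the comparisons can be made with `==` rather than a tolerance.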


For each element of the algebra, we introduce a gauge field $A^a_\mu$. These are then
packaged into the Lie-algebra valued gauge potential

$$A_\mu = A^a_\mu T^a \qquad (2.3)$$

This is a rather abstract object, taking values in a Lie algebra. For $G = SU(N)$, a
more down to earth perspective is to view $A_\mu$ simply as a traceless $N \times N$ Hermitian
matrix.


We will refer to the fields $A^a_\mu$ collectively as gluons, in deference to the fact that the
strong nuclear force is described by $G = SU(3)$ Yang-Mills theory. From the gauge
potential, we construct the Lie-algebra valued field strength

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - i[A_\mu, A_\nu] \qquad (2.4)$$

Since this is valued in the Lie algebra, we could also expand it as $F_{\mu\nu} = F^a_{\mu\nu}T^a$. In
more mathematical terminology, $A_\mu$ is called a connection and the field strength $F_{\mu\nu}$ is
referred to as the curvature. We’ll see what exactly the connection connects in Section
2.1.3.


Although we won’t look at dynamical matter fields until later in this section, it will
prove useful to briefly introduce relevant conventions here. Matter fields live in some
representation $R$ of the gauge group $G$. This means that they sit in some vector $\psi$
of dimension $\dim R$. Much of our focus will be on matter fields in the fundamental
representation of $G = SU(N)$, in which case $\psi$ is an $N$-dimensional complex vector.
The matter fields couple to the gauge fields through a covariant derivative, defined by

$$D_\mu\psi = \partial_\mu\psi - iA_\mu\psi \qquad (2.5)$$

However, the algebra $\mathfrak{g}$ has many different representations $R$. For each such
representation, we have generators $T^a(R)$ which we can think of as square matrices of
dimension $\dim R$. Dressed with all their indices, they take the form

$$T^a(R)^i{}_j \qquad i,j = 1,\ldots,\dim R\ ;\ a = 1,\ldots,\dim G$$

For each of these representations, we can package the gauge fields into a Lie algebra
valued object $A^a_\mu T^a(R)^i{}_j$. We can then couple matter in the representation $R$ by
generalising the covariant derivative from the fundamental representation to

$$D_\mu\psi^i = \partial_\mu\psi^i - iA^a_\mu T^a(R)^i{}_j\,\psi^j \qquad i,j = 1,\ldots,\dim R \qquad (2.6)$$

Each of these representations offers a different way of packaging the fields $A^a_\mu$
into Lie-algebra valued objects $A_\mu$. As we mentioned above, we will mostly focus on
$G = SU(N)$: in this case, we usually take $T^a$ in the fundamental representation, in
which case $A_\mu$ is simply an $N \times N$ Hermitian matrix.

Aside from the fundamental, there is one other representation that will frequently
arise: this is the adjoint, for which $\dim R = \dim G$. We could think of these fields as
forming a vector $\phi^a$, with $a = 1,\ldots,\dim G$, and then use the form of the covariant
derivative (2.6). In fact, it turns out to be more useful to package adjoint valued
matter fields into a Lie-algebra valued object, $\phi = \phi^aT^a$. In this language the covariant
derivative can be written as

$$D_\mu\phi = \partial_\mu\phi - i[A_\mu, \phi] \qquad (2.7)$$


The field strength can be constructed from the commutator of covariant derivatives.
It’s not hard to check that

$$[D_\mu, D_\nu]\psi = -iF_{\mu\nu}\psi$$

The same kind of calculation shows that if $\phi$ is in the adjoint representation,

$$[D_\mu, D_\nu]\phi = -i[F_{\mu\nu}, \phi]$$

where the right-hand-side is to be thought of as the action of $F_{\mu\nu}$ on fields in the adjoint
representation. More generally, we write $[D_\mu, D_\nu] = -iF_{\mu\nu}$, with the understanding that
the right-hand-side acts on fields according to their representation.
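
The check is short enough to write out. A sketch of the calculation for $\psi$ in the fundamental representation, using only the definitions (2.4) and (2.5), is:

```latex
\begin{align}
D_\mu D_\nu \psi &= (\partial_\mu - iA_\mu)(\partial_\nu\psi - iA_\nu\psi) \nonumber\\
 &= \partial_\mu\partial_\nu\psi - i(\partial_\mu A_\nu)\psi - iA_\nu\partial_\mu\psi
    - iA_\mu\partial_\nu\psi - A_\mu A_\nu\psi \nonumber\\
[D_\mu, D_\nu]\psi &= -i(\partial_\mu A_\nu - \partial_\nu A_\mu)\psi - [A_\mu, A_\nu]\psi \nonumber\\
 &= -i\left(\partial_\mu A_\nu - \partial_\nu A_\mu - i[A_\mu, A_\nu]\right)\psi
  = -iF_{\mu\nu}\psi \nonumber
\end{align}
```

The terms symmetric in $\mu$ and $\nu$ drop out of the commutator, leaving only the combination that defines the field strength.
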
2.1.1 The Action


The dynamics of Yang-Mills is determined by an action principle. We work in natural
units, with $\hbar = c = 1$ and take the action

$$S_{YM} = -\frac{1}{2g^2}\int d^4x\ {\rm tr}\,F^{\mu\nu}F_{\mu\nu} \qquad (2.8)$$


where g? is the Yang-Mills coupling. (It’s often called the “coupling constant” but, as 
we will see in Section 2.4, there is nothing constant about it so I will try to refrain from 
this language). 


If we compare to the Maxwell action (1.10), we see that there is a factor of 1/2 
outside the action, rather than a factor of 1/4; this is accounted for by the further 
factor of 1/2 that appears in the normalisation of the trace (2.2). There is also the 
extra factor of 1/g? that we will explain below. 


The classical equations of motion are derived by minimizing the action with respect
to each gauge field $A^a_\mu$. It is a simple exercise to check that they are given by

$$D_\mu F^{\mu\nu} = 0 \qquad (2.9)$$

where, because $F_{\mu\nu}$ is Lie-algebra valued, the definition (2.7) of the covariant derivative
is the appropriate one.

There is also a Bianchi identity that follows from the definition of $F_{\mu\nu}$ in terms of
the gauge field. This is best expressed by first introducing the dual field strength

$${}^\star F^{\mu\nu} = \frac{1}{2}\epsilon^{\mu\nu\rho\sigma}F_{\rho\sigma}$$

and noting that this obeys the identity

$$D_\mu{}^\star F^{\mu\nu} = 0 \qquad (2.10)$$

The equations (2.9) and (2.10) are the non-Abelian generalisations of the Maxwell
equations. They differ only in commutator terms, both those inside $D_\mu$ and those inside
$F_{\mu\nu}$. Even in the classical theory, this is a big difference as the resulting equations are
non-linear. This means that the Yang-Mills fields interact with themselves.


Note that we need to introduce the gauge potentials $A_\mu$ in order to write down
the Yang-Mills equations of motion. This is in contrast to Maxwell theory where the
Maxwell equations can be expressed purely in terms of ${\bf E}$ and ${\bf B}$ and we introduce
gauge fields, at least classically, merely as a device to solve them.


A Rescaling 


Usually in quantum field theory, the coupling constants multiply the interaction terms 
in the Lagrangian; these are terms which are higher order than quadratic, leading to 
non-linear terms in the equations of motion. 


However, in the Yang-Mills action, all terms appear with fixed coefficients determined
by the definition of the field strength (2.4). Instead, we’ve chosen to write the (inverse)
coupling as multiplying the entire action. This difference can be accounted for by a
trivial rescaling. We define

$$\tilde{A}_\mu = \frac{1}{g}A_\mu \quad {\rm and} \quad \tilde{F}_{\mu\nu} = \partial_\mu\tilde{A}_\nu - \partial_\nu\tilde{A}_\mu - ig[\tilde{A}_\mu, \tilde{A}_\nu]$$

Then, in terms of this rescaled field, the Yang-Mills action is

$$S_{YM} = -\frac{1}{2g^2}\int d^4x\ {\rm tr}\,F^{\mu\nu}F_{\mu\nu} = -\frac{1}{2}\int d^4x\ {\rm tr}\,\tilde{F}^{\mu\nu}\tilde{F}_{\mu\nu}$$

In the second version of the action, the coupling constant is buried inside the definition
of the field strength, where it multiplies the non-linear terms in the equation of motion
as expected.
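
To see why the two forms of the action agree, one can substitute $A_\mu = g\tilde{A}_\mu$ into the definition (2.4). A sketch of the step:

```latex
\begin{align}
F_{\mu\nu} = g\,\partial_\mu\tilde{A}_\nu - g\,\partial_\nu\tilde{A}_\mu - ig^2[\tilde{A}_\mu, \tilde{A}_\nu]
           = g\,\tilde{F}_{\mu\nu}
\quad\Rightarrow\quad
\frac{1}{2g^2}\,{\rm tr}\,F^{\mu\nu}F_{\mu\nu} = \frac{1}{2}\,{\rm tr}\,\tilde{F}^{\mu\nu}\tilde{F}_{\mu\nu}
\end{align}
```

The two powers of $g$ from the field strengths cancel the $1/g^2$ in front of the action.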

In what follows, we will use the normalisation (2.8). This is the more useful choice in
the quantum theory, where $S_{YM}$ sits exponentiated in the partition function. One way
to see this is to note that $g^2$ sits in the same place as $\hbar$ in the partition function. This
already suggests that $g^2 \to 0$ will be a classical limit. Heuristically you should think
that, for $g^2$ small, one pays a large price for field configurations that do not minimize
the action; in this way, the path integral is dominated by the classical configurations.
In contrast, when $g^2 \to \infty$, the Yang-Mills action disappears completely. This is the
strong coupling regime, where all field configurations are unsuppressed and contribute
equally to the path integral.


Based on this, you might think that we can just set g? to be small and a classical 
analysis of the equations of motion (2.9) and (2.10) will be a good starting point to 
understand the quantum theory. As we will see in Section 2.4, it turns out that this is 
not an option; instead, the theory is much more subtle and interesting. 

2.1.2 Gauge Symmetry


The action (2.8) has a very large symmetry group. This comes from
spacetime-dependent functions of the Lie group $G$,

$$\Omega(x) \in G$$

The set of all such transformations is known as the gauge group. Sometimes we will be
sloppy, and refer to the Lie group $G$ as the gauge group, but strictly speaking it is the
much bigger group of maps from spacetime into $G$. The action on the gauge field is

$$A_\mu \to \Omega(x)A_\mu\Omega^{-1}(x) + i\Omega(x)\partial_\mu\Omega^{-1}(x) \qquad (2.11)$$

A short calculation shows that this induces the action on the field strength

$$F_{\mu\nu} \to \Omega(x)F_{\mu\nu}\Omega^{-1}(x) \qquad (2.12)$$

The Yang-Mills action is then invariant by virtue of the trace in (2.8).


In the case that $G = U(1)$, the transformations above reduce to the familiar gauge
transformations of electromagnetism. In this case we can write $\Omega = e^{i\omega}$ and the
transformation of the gauge field becomes $A_\mu \to A_\mu + \partial_\mu\omega$.
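
To see the reduction explicitly, one can substitute $\Omega = e^{i\omega(x)}$ into (2.11) and (2.12). Because everything commutes for $G = U(1)$, the steps are:

```latex
\begin{align}
A_\mu &\to e^{i\omega} A_\mu e^{-i\omega} + i e^{i\omega}\,\partial_\mu e^{-i\omega}
       = A_\mu + i e^{i\omega}\left(-i\partial_\mu\omega\right)e^{-i\omega}
       = A_\mu + \partial_\mu\omega \nonumber\\
F_{\mu\nu} &\to e^{i\omega} F_{\mu\nu}\, e^{-i\omega} = F_{\mu\nu} \nonumber
\end{align}
```

so the field strength is strictly invariant in the Abelian case, rather than merely covariant.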


Gauge symmetry is poorly named. It is not a symmetry of the system in the sense 
that it takes one physical state to a different physical state. Instead, it is a redundancy 
in our description of the system. This is familiar from electromagnetism and remains 
true in Yang-Mills theory. 

There are a number of ways to see why we should interpret the gauge symmetry as
a redundancy of the system. Roughly speaking, all of them boil down to the statement
that the theory fails to make sense unless we identify states related by gauge
transformations. This can be seen classically where the equations of motion (2.9) and (2.10) do
not uniquely specify the evolution of $A_\mu$, but only its equivalence class subject to the
identification (2.11). In the quantum theory, the gauge symmetry is needed to remove
various pathologies which arise, such as the presence of negative norm states in the
Hilbert space. A more precise explanation for the redundancy comes from appreciating
that Yang-Mills theory is a constrained system which should be analysed as such using
the technology of Dirac brackets; we will not do this here.


Our best theories of Nature are electromagnetism, Yang-Mills and general relativity. 
Each is based on an underlying gauge symmetry. Indeed, the idea of gauge symmetry 
is clearly something deep. Yet it is, at heart, nothing more than an ambiguity in the
language we choose to present the physics. Why should Nature revel in such ambiguity?


There are two reasons why it’s advantageous to describe Nature in terms of a redun- 
dant set of variables. First, although gauge symmetry means that our presentation of 
the physics is redundant, it appears to be by far the most concise presentation. For 
example, we will shortly describe the gauge invariant observables of Yang-Mills theory; 
they are called “Wilson lines” and can be derived from the gauge potentials $A_\mu$. Yet
presenting a configuration of the Yang-Mills field in terms of a complete set of Wilson
lines would require vastly more information than specifying the four matrix-valued fields $A_\mu$.


The second reason is that the redundant gauge fields allow us to describe the dynamics
of the theory in a way that makes manifest various properties of the theory that we hold
dear, such as Lorentz invariance and locality and, in the quantum theory, unitarity. This
is true even in Maxwell theory: the photon has two polarisation states. Yet try writing
down a field which describes the photon that has only two indices and which transforms
nicely under the SO(3,1) Lorentz group; it’s not possible. Instead we introduce a field
with four indices, $A_\mu$, and then use the gauge symmetry to kill two of the resulting
states. The same kind of arguments also apply to the Yang-Mills field, where there are
now two physical degrees of freedom associated to each generator $T^a$.


The redundancy inherent in the gauge symmetry means that only gauge independent 
quantities should be considered physical. These are the things that do not depend on 
our underlying choice of description. In general relativity, we would call such objects 
“coordinate independent”, and it’s not a bad metaphor to have in mind for Yang-Mills.
It’s worth pointing out that in Yang-Mills theory, the “electric field” $E_i = F_{0i}$ and the
“magnetic field” $B_i = -\frac{1}{2}\epsilon_{ijk}F_{jk}$ are not gauge invariant as they transform as (2.12).
This, of course, is in contrast to electromagnetism where electric and magnetic fields
are physical objects. Instead, if we want to construct gauge invariant quantities we
should work with traces such as $\text{tr}\, F_{\mu\nu}F_{\rho\sigma}$ or the Wilson lines that we will describe
below. (Note that, for simple gauge groups such as SU(N), the trace of a single field
strength vanishes: $\text{tr}\, F_{\mu\nu} = 0$.)
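
A quick numerical illustration of this point, offered as a sketch under stated assumptions: a random traceless Hermitian matrix stands in for one component of the electric field at a single point, a random SU(3) matrix stands in for $\Omega$, and the adjoint action of (2.12) is applied. The choice N = 3 and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 3  # hypothetical choice: G = SU(3)


def random_traceless_hermitian(n):
    """A random element of the Lie algebra su(n), written as a Hermitian matrix."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    h = (m + m.conj().T) / 2
    return h - np.trace(h) / n * np.eye(n)


def exp_i(h):
    """exp(i h) for Hermitian h, via the eigendecomposition h = V diag(e) V^dagger."""
    evals, evecs = np.linalg.eigh(h)
    return evecs @ np.diag(np.exp(1j * evals)) @ evecs.conj().T


E = random_traceless_hermitian(N)             # stand-in for one component E_i at a point
Omega = exp_i(random_traceless_hermitian(N))  # an SU(N) matrix: unitary with unit determinant
E_prime = Omega @ E @ Omega.conj().T          # adjoint action, as in (2.12)

change_in_E = np.max(np.abs(E_prime - E))                             # generically order one
change_in_tr_E2 = abs(np.trace(E_prime @ E_prime) - np.trace(E @ E))  # zero up to rounding
trace_of_E = abs(np.trace(E))                                         # zero up to rounding
print(change_in_E, change_in_tr_E2, trace_of_E)
```

The matrix itself changes under the transformation, while the quadratic trace does not, and the single trace vanishes identically.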


Before we proceed, it’s useful to think about infinitesimal gauge transformations. To 
leading order, gauge transformations which are everywhere close to the identity can be 
written as 


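$$\Omega(x) \approx 1 + i\omega^a(x)T^a + \ldots$$

The infinitesimal change of the gauge field from (2.11) becomes

$$\delta A_\mu = \partial_\mu\omega - i[A_\mu, \omega] = D_\mu\omega$$

where $\omega = \omega^a T^a$. Similarly, the infinitesimal change of the field strength is

$$\delta F_{\mu\nu} = i[\omega, F_{\mu\nu}]$$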


Importantly, however, there are classes of gauge transformations which cannot be de- 
formed so that they are everywhere close to the identity. We will study these in Section 
2 


2.1.3 Wilson Lines and Wilson Loops 


It is a maxim in physics, one that leads to much rapture, that “gravity is geometry”. 
But the same is equally true of all the forces of Nature since gauge theory is rooted 
in geometry. In the language of mathematics, gauge theory is an example of a fibre 
bundle, and the gauge field $A_\mu$ is referred to as a connection.


We met the idea of connections in general relativity. There, the Levi-Civita connection
$\Gamma^\mu_{\rho\sigma}$ tells us how to parallel transport vectors around a manifold. The Yang-Mills
connection $A_\mu$ plays the same role, but now for the appropriate “electric charge”. First
we need to explain what this appropriate charge is. 


Throughout this section, we will consider a fixed background Yang-Mills field $A_\mu(x)$.
In this background, we place a test particle. The test particle is going to be under our 
control: we’re holding it and we get to choose how it moves and where it goes. But 
the test particle will carry an internal degree of freedom (this is the “electric charge”)
and the evolution of this internal degree of freedom is determined by the background
Yang-Mills field.

This internal degree of freedom sits in some representation R of the Lie group G. To
start with, we will think of the particle as carrying a complex vector, w, of fixed length,

$$w_i, \quad i = 1, \ldots, \dim R \qquad \text{such that} \qquad w^\dagger w = \text{constant}$$


In analogy with QCD, we will refer to the electrically charged particles as quarks, and
to $w_i$ as the colour degree of freedom. The $w_i$ is sometimes called chromoelectric charge.


As the particle moves around the manifold, the connection $A_\mu$ (or, to dress it with all
its indices, $(A_\mu)^i{}_j = A^a_\mu (T^a)^i{}_j$) tells this vector w how to rotate. In Maxwell theory, this
“parallel transport” is nothing more than the Aharonov-Bohm effect that we discussed
in Section 1.1. Upon being transported around a closed loop C, a particle returns
with a phase given by $\exp\left(i\oint_C A\right)$. We’d like to write down the generalisation of this
formula for non-Abelian gauge theory. For a particle moving with worldline $x^\mu(\tau)$, the
rotation of the internal vector w is governed by the parallel transport equation


$$i\frac{dw}{d\tau} = \frac{dx^\mu}{d\tau}\, A_\mu(x)\, w \qquad (2.13)$$

The factor of i ensures that, with $A_\mu$ Hermitian, the length of the vector $w^\dagger w$ remains
constant. Suppose that the particle moves along a curve C, starting at $x^\mu_i = x^\mu(\tau_i)$ and
finishing at $x^\mu_f = x^\mu(\tau_f)$. Then the rotation of the vector depends on both the starting
and end points, as well as the path between them,

$$w(\tau_f) = U[x_i, x_f; C]\, w(\tau_i)$$

where

$$U[x_i, x_f; C] = P \exp\left( i\int_{\tau_i}^{\tau_f} d\tau\, \frac{dx^\mu}{d\tau}\, A_\mu(x(\tau)) \right) = P \exp\left( i\int_{x_i}^{x_f} A \right) \qquad (2.14)$$


where P stands for path ordering. It means that when expanding the exponential, we
order the matrices $A_\mu(x(\tau))$ so that those at earlier times are placed to the left. (We
met this notation previously in the lectures on quantum field theory when discussing
Dyson’s formula and you can find more explanation there.) The object $U[x_i, x_f; C]$ is
referred to as the Wilson line. Under a gauge transformation $\Omega(x)$, it changes as

$$U[x_i, x_f; C] \to \Omega(x_i)\, U[x_i, x_f; C]\, \Omega^\dagger(x_f)$$
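
The role of the factor of i and of the ordering can be seen in a small numerical experiment. This is a sketch under assumptions, not a definition of the Wilson line: it integrates the transport equation (2.13) exactly as written for a made-up Hermitian matrix function `A_along_path` (standing in for $\frac{dx^\mu}{d\tau}A_\mu$ along some path), by multiplying together short unitary steps. It makes no attempt to fix the overall sign and ordering conventions of (2.14).

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)


def A_along_path(tau):
    """Hypothetical Hermitian matrix A(tau) = (dx^mu/dtau) A_mu(x(tau)) along a path."""
    return (np.cos(3 * tau) * SIGMA_X + np.sin(3 * tau) * SIGMA_Y + tau * SIGMA_Z) / 2


def step_matrix(a, dtau):
    """exp(-i a dtau) for Hermitian a: solves i dw/dtau = a w over one short step."""
    evals, evecs = np.linalg.eigh(a)
    return evecs @ np.diag(np.exp(-1j * evals * dtau)) @ evecs.conj().T


def transport(num_steps=2000, tau_f=1.0, reverse=False):
    """Ordered product of step matrices; each new step multiplies from the left."""
    dtau = tau_f / num_steps
    midpoints = (np.arange(num_steps) + 0.5) * dtau
    if reverse:
        midpoints = midpoints[::-1]  # deliberately scramble the ordering
    total = np.eye(2, dtype=complex)
    for tau in midpoints:
        total = step_matrix(A_along_path(tau), dtau) @ total
    return total


w_initial = np.array([1.0, 0.0], dtype=complex)
U = transport()
w_final = U @ w_initial

# Hermitian A gives unitary steps, so the length of w is preserved.
norm_drift = abs(np.vdot(w_final, w_final) - np.vdot(w_initial, w_initial))
# Because the A(tau) at different times do not commute, the ordering matters.
ordering_gap = np.max(np.abs(U - transport(reverse=True)))
print(norm_drift, ordering_gap)
```

The first number is zero up to rounding, which is the statement that $w^\dagger w$ stays constant. The second is of order one-tenth for this choice of `A_along_path`, which is why the exponential in (2.14) needs a path-ordering prescription at all.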


If we take the particle on a closed path C, this object tells us how the vector w differs 
from its starting value. In mathematics, this notion is called holonomy. In this case, 
we can form a gauge invariant object known as the Wilson loop, 


$$W[C] = \text{tr}\, P \exp\left( i\oint A \right) \qquad (2.15)$$

The Wilson loop $W[C]$ depends on the representation R of the gauge field, and its value
along the path C. This will play an important role in Section 2.5 when we describe 
ways to test for confinement. 


Quantising the Colour Degree of Freedom 


Above we viewed the colour degree of freedom as a vector w. This is a very classical 
perspective. It is better to think of each quark as carrying a finite dimensional Hilbert 
space $\mathcal{H}_{\rm quark}$, of dimension $\dim \mathcal{H}_{\rm quark} = \dim R$.


Here we will explain how to accomplish this. This will provide yet another perspec- 
tive on the Wilson loop. What follows also offers an opportunity to explain a basic 
aspect of quantum mechanics which is often overlooked when we first meet the subject. 
The question is the following: what classical system gives rise to a finite dimensional 
quantum Hilbert space? Even the simplest classical systems that we meet as under- 
graduates, such as the harmonic oscillator, give rise to an infinite dimensional Hilbert 
space. Instead, the much simpler finite dimensional systems, such as the spin of the 
electron, are typically introduced as having no classical analog. Here we’ll see that 
there is an underlying classical system and that it’s rather simple. 


We'll stick with a G = SU(N) gauge theory. We consider a single test particle and 
attach to it a complex vector w, but this time we will insist that w has dimension N. 
We will restrict its length to be 


$$w^\dagger w = \kappa \qquad (2.16)$$

The action which reproduces the equation of motion (2.13) is

$$S_w = \int d\tau\ i w^\dagger \frac{dw}{d\tau} + \lambda\,(w^\dagger w - \kappa) + w^\dagger A(x(\tau))\, w \qquad (2.17)$$

where $\lambda$ is a Lagrange multiplier to impose the constraint (2.16), and where $A =
A_\mu\, dx^\mu/d\tau$ is to be thought of as a fixed background gauge field $A_\mu(x)$ which varies in
time in some fixed way as the particle moves along the path $x^\mu(\tau)$.


Perhaps surprisingly, the action (2.17) has a U(1) worldline gauge symmetry. This 
acts as 


$$w \to e^{i\alpha(\tau)}\, w \qquad \text{and} \qquad \lambda \to \lambda + \dot{\alpha}$$

for any $\alpha(\tau)$. Physically, this gauge symmetry means that we should identify vectors
which differ only by a phase: $w_i \sim e^{i\alpha} w_i$. Since we already have the constraint (2.16),
this means that the vectors parameterise the projective space $S^{2N-1}/U(1) = \mathbb{CP}^{N-1}$.

Importantly, our action is first order in time derivatives rather than second order.
This means that the momentum conjugate to $w$ is $i w^\dagger$ and, correspondingly, $\mathbb{CP}^{N-1}$
is the phase space of the system rather than the configuration space. This, it turns
out, is the key to getting a finite dimensional Hilbert space: you should quantise a
system with a finite volume phase space. Indeed, this fits nicely with the old-fashioned
Bohr-Sommerfeld view of quantisation in which one takes the phase space and assigns a
quantum state to each region of extent $\sim \hbar$. A finite volume then gives a finite number
of states.


We can see this in a more straightforward way by doing canonical quantisation. The
unconstrained variables $w_i$ obey the commutation relations

$$[w_i, w_j^\dagger] = \delta_{ij} \qquad (2.18)$$

But we recognise these as the commutation relations of creation and annihilation
operators. We define a “ground state” $|0\rangle$ such that $w_i|0\rangle = 0$ for all $i = 1, \ldots, N$. A
general state in the Hilbert space then takes the form

$$w^\dagger_{i_1} \cdots w^\dagger_{i_n} |0\rangle \qquad (2.19)$$
However, we also need to take into account the constraint (2.16). Note that this now
arises as the equation of motion for the worldline gauge field $\lambda$. As such, it is analogous
to Gauss’ law when quantising Maxwell theory and we should impose it as a constraint
that defines the physical Hilbert space. There is an ordering ambiguity in defining this
constraint in the quantum theory: we choose to work with the normal ordered constraint

$$(w_i^\dagger w_i - \kappa)\,|\text{phys}\rangle = 0$$

This tells us that the physical spectrum of the theory has precisely $\kappa$ excitations. In this
way, we restrict from the infinite dimensional Hilbert space (2.19) to a finite dimensional
subspace. However, clearly this restriction only makes sense if we take

$$\kappa \in \mathbb{Z}^+ \qquad (2.20)$$


This is interesting. We have an example where a parameter in an action can only 
take integer values. We will see many further examples as these lectures progress. 
In the present context, the quantisation of $\kappa$ means that the $\mathbb{CP}^{N-1}$ phase space of
the system has a quantised volume. Again, this sits nicely with the Bohr-Sommerfeld 
interpretation of dividing the phase space up into parcels. 

For each choice of $\kappa$, the Hilbert space inherits an action under the SU(N) symmetry.
For example:

• $\kappa = 0$: The Hilbert space consists of a single state, $|0\rangle$. This is equivalent to
putting a particle in the trivial representation of the gauge group.

• $\kappa = 1$: The Hilbert space consists of N states, $w_i^\dagger|0\rangle$. This describes a particle
transforming in the fundamental representation of the SU(N) gauge group.

• $\kappa = 2$: The Hilbert space consists of $\frac{1}{2}N(N+1)$ states, $w_i^\dagger w_j^\dagger|0\rangle$, transforming in
the symmetric representation of the gauge group.

In this way, we can build any symmetric representation of SU(N). If we were to treat
the degrees of freedom $w_i$ as Grassmann variables, and so replace the commutators in
(2.18) with anti-commutators, $\{w_i, w_j^\dagger\} = \delta_{ij}$, then it’s easy to convince yourself that
we would end up with particles in the anti-symmetric representations of SU(N).
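
The counting of states above can be reproduced by brute force. The sketch below simply enumerates the index sets that label the states $w^\dagger_{i_1}\cdots w^\dagger_{i_\kappa}|0\rangle$, with repeated indices allowed for commuting $w$ and forbidden for anticommuting $w$. The function names and the choice N = 3 are illustrative only.

```python
from itertools import combinations, combinations_with_replacement
from math import comb


def bosonic_states(n_colours, kappa):
    """Index sets i_1 <= ... <= i_kappa labelling states built from commuting w^dagger."""
    return list(combinations_with_replacement(range(n_colours), kappa))


def fermionic_states(n_colours, kappa):
    """Index sets i_1 < ... < i_kappa: for anticommuting w^dagger no index may repeat."""
    return list(combinations(range(n_colours), kappa))


N = 3
symmetric_dims = [len(bosonic_states(N, k)) for k in range(4)]
antisymmetric_dims = [len(fermionic_states(N, k)) for k in range(N + 1)]

# Dimensions of the kappa-th symmetric and antisymmetric powers of the fundamental.
assert symmetric_dims == [comb(N + k - 1, k) for k in range(4)]
assert antisymmetric_dims == [comb(N, k) for k in range(N + 1)]

print(symmetric_dims)      # [1, 3, 6, 10]
print(antisymmetric_dims)  # [1, 3, 3, 1]
```

For $\kappa = 2$ the bosonic count is $\frac{1}{2}N(N+1) = 6$, as quoted above. In the Grassmann case the list terminates at $\kappa = N$, so only finitely many values of $\kappa$ give a non-empty Hilbert space.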


The Path Integral over the Colour Degrees of Freedom 


We can also study the quantum mechanical action (2.17) using the path integral. Here 
we fix the background gauge field $A_\mu$ and integrate only over the colour degrees of
freedom $w(\tau)$ and the Lagrange multiplier $\lambda(\tau)$.


First, we ask: how can we see the quantisation condition of $\kappa$ (2.20) in the path
integral? There is a rather lovely topological argument for this, one which will be
repeated a number of times in subsequent chapters. The first thing to note is that the
term $\kappa\lambda$ in the Lagrangian transforms as a total derivative under the gauge symmetry.
Naively we might think that we can just ignore this. However, we shouldn’t be quite 
so quick as there are situations where this term is non-vanishing. 


Suppose that we think of the worldline of the system, parameterised by $\tau \in S^1$ rather
than $\mathbb{R}$. Then we can consider gauge transformations $\alpha(\tau)$ in which $\alpha$ winds around
the circle, so that $\int d\tau\, \dot{\alpha} = 2\pi n$ for some $n \in \mathbb{Z}$. The action (2.17) would then change
as

$$S_w \to S_w + 2\pi\kappa n$$

under a gauge transformation, which seems bad. However, in the quantum theory it’s
not the action $S_w$ that we have to worry about but $e^{iS_w}$ because this is what appears
in the path integral. And $e^{iS_w}$ is gauge invariant provided that $\kappa \in \mathbb{Z}$.

It is not difficult to explicitly compute the path integral. For convenience, we'll set
$\kappa = 1$, so we’re looking at objects in the $\mathbf{N}$ representation of SU(N). It’s not hard to
see that the path integral over $\lambda$ causes the partition function to vanish unless we put
in two insertions of w. We should therefore compute

$$Z_w[A] = \int \mathcal{D}\lambda\, \mathcal{D}w\, \mathcal{D}w^\dagger\ e^{iS_w(w,\lambda;A)}\ w_i(\tau = \infty)\, w_i^\dagger(\tau = -\infty)$$

The insertion at $\tau = -\infty$ can be thought of as placing the particle in some particular
internal state. The partition function measures the amplitude that it remains in that
state at $\tau = +\infty$.


We next perform the path integral over $w$ and $w^\dagger$. This is tantamount to summing
a series of diagrams like this:

[Diagram: a series of worldline diagrams drawn with straight lines and dotted lines]

where the straight lines are propagators for $w_i$ which are simply $\theta(\tau_1 - \tau_2)\delta_{ij}$, while
the dotted lines represent insertions of the gauge fields A. It’s straightforward to sum
these. The final result is something familiar:


$$Z_w[A] = \text{tr}\, P \exp\left( i\int d\tau\, A(\tau) \right) \qquad (2.21)$$


This, of course, is the Wilson loop W[C]. We see that we get a slightly different 
perspective on the Wilson loop: it arises by integrating out the colour degrees of freedom 
of the quark test particle. 


2.2 The Theta Term 
The Yang-Mills action is the obvious generalisation of the Maxwell action,

$$S_{YM} = -\frac{1}{2g^2}\int d^4x\ \text{tr}\, F^{\mu\nu}F_{\mu\nu}$$

There is, however, one further term that we can add which is Lorentz invariant, gauge
invariant and quadratic in field strengths. This is the theta term,

$$S_\theta = \frac{\theta}{16\pi^2}\int d^4x\ \text{tr}\, {}^\star F^{\mu\nu}F_{\mu\nu} \qquad (2.22)$$

where ${}^\star F^{\mu\nu} = \frac{1}{2}\epsilon^{\mu\nu\rho\sigma}F_{\rho\sigma}$. Clearly, this is analogous to the theta term that we met in
Maxwell theory in Section 1.2. Note, however, that the canonical normalisation of the
Yang-Mills theta term differs by a factor of 1/2 from the Maxwell term (a fact which is
a little hidden in this notation because it’s buried in the definition of the trace (2.2)).
We’ll understand why this is the case below. (A spoiler: it’s because the periodicity of
the Maxwell theta term arises from the first Chern number, $c_1(A)^2$, while the periodicity
of the non-Abelian theta-term arises from the second Chern number $c_2(A)$.)


The non-Abelian theta term shares a number of properties with its Abelian counter- 
part. In particular, 


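• The theta term is a total derivative. It can be written as

$$S_\theta = \frac{\theta}{8\pi^2}\int d^4x\ \partial_\mu K^\mu \qquad (2.23)$$

where

$$K^\mu = \epsilon^{\mu\nu\rho\sigma}\, \text{tr}\left( A_\nu \partial_\rho A_\sigma - \frac{2i}{3} A_\nu A_\rho A_\sigma \right) \qquad (2.24)$$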


This means that, as in the Maxwell case, the theta term does not change the 
classical equations of motion. 


• $\theta$ is an angular variable. For simple gauge groups, it sits in the range

$$\theta \in [0, 2\pi)$$

This follows because the total derivative (2.23) counts the winding number of
a gauge configuration, known as the Pontryagin number, such that, evaluated on
any configuration, $S_\theta = \theta n$ with $n \in \mathbb{Z}$. This is similar in spirit to the kind of
argument we saw in Section 1.2.4 for the U(1) theta angle, although the details 
differ because non-Abelian gauge groups have a different topology from their 
Abelian cousins. We will explain this in the rest of this section and, from a 
slightly different perspective, in Section 2.3. 


There can, however, be subtleties associated to discrete identifications in the 
gauge group in which case the range of $\theta$ should be extended. We’ll discuss this
in more detail in Section 2.6. 


In Section 1.2, we mostly focussed on situations where $\theta$ varies in space. This kind
of “topological insulator” physics also applies in the non-Abelian case. However, as
we mentioned above, the topology of non-Abelian gauge groups is somewhat more
complicated. This, it turns out, affects the spectrum of states in the Yang-Mills theory
even when $\theta$ is constant. The purpose of this section is to explore this physics.

2.2.1 Canonical Quantisation of Yang-Mills


Ultimately, we want to see how the $\theta$ term affects the quantisation of Yang-Mills. But
we can see the essence of the issue already in the classical theory where, as we will now
show, the $\theta$ term results in a shift to the canonical momentum. The full Lagrangian is

$$\mathcal{L} = -\frac{1}{2g^2}\, \text{tr}\, F^{\mu\nu}F_{\mu\nu} + \frac{\theta}{16\pi^2}\, \text{tr}\, {}^\star F^{\mu\nu}F_{\mu\nu} \qquad (2.25)$$

To start, we make use of the gauge redundancy to set

$$A_0 = 0$$

With this ansatz, the Lagrangian becomes

$$\mathcal{L} = \frac{1}{g^2}\, \text{tr}\left( \dot{\mathbf{A}}^2 - \mathbf{B}^2 \right) + \frac{\theta}{4\pi^2}\, \text{tr}\, \dot{\mathbf{A}} \cdot \mathbf{B} \qquad (2.26)$$

Here $B_i = -\frac{1}{2}\epsilon_{ijk}F_{jk}$ is the non-Abelian magnetic field (sometimes called the
chromomagnetic field). Meanwhile, the non-Abelian electric field is $E_i = \dot{A}_i$. I’ve chosen not to
use the electric field notation in (2.26) as the $\dot{\mathbf{A}}$ terms highlight the canonical structure.
Note that the $\theta$ term is linear in time derivatives; this is reminiscent of the effect of a
magnetic field in Newtonian particle mechanics and we will see some similarities below.


The Lagrangian (2.26) is not quite equivalent to (2.25); it should be supplemented
by the equation of motion for $A_0$. In analogy with electromagnetism, we refer to this
as Gauss’ law. It is

$$D_i E_i = 0 \qquad (2.27)$$

This is a constraint which should be imposed on all physical field configurations.

The momentum conjugate to $\mathbf{A}$ is

$$\boldsymbol{\pi} = \frac{1}{g^2}\mathbf{E} + \frac{\theta}{8\pi^2}\mathbf{B}$$

From this we can build the Hamiltonian

$$H = \frac{1}{g^2}\, \text{tr}\left( \mathbf{E}^2 + \mathbf{B}^2 \right) \qquad (2.28)$$


We see that, when written in terms of the electric field $\mathbf{E}$, neither the constraint (2.27)
nor the Hamiltonian (2.28) depends on $\theta$; all of the dependence is buried in the Poisson
bracket structure. Indeed, when written in terms of the canonical momentum $\boldsymbol{\pi}$, the
constraint becomes

$$D_i \pi_i = 0$$

where the would-be extra term $D_i B_i = 0$ by virtue of the Bianchi identity (2.10).
Meanwhile the Hamiltonian becomes

$$H = g^2\, \text{tr}\left( \boldsymbol{\pi} - \frac{\theta}{8\pi^2}\mathbf{B} \right)^2 + \frac{1}{g^2}\, \text{tr}\, \mathbf{B}^2$$

It is this $\theta$-dependent shift in the canonical momentum which affects the quantum
theory.


Building the Hilbert Space 


Let’s first recall how we construct the physical Hilbert space of Maxwell theory where,
for now, we set $\theta = 0$. For this Abelian theory, Gauss’ law (2.27) is linear in $\mathbf{A}$ and
it is equivalent to $\nabla \cdot \mathbf{A} = 0$. This makes it simple to solve: the constraint kills the
longitudinal photon mode, leaving us with two physical transverse modes. We can then
proceed to build the Hilbert space describing just these physical degrees of freedom. 
This was the story we learned in our first course on Quantum Field Theory. 


In contrast, things aren’t so simple in Yang-Mills theory. Now Gauss’ law (2.27)
is non-linear and it’s not so straightforward to solve the constraint to isolate only the
physical degrees of freedom. Instead, we proceed as follows. We start by constructing an
auxiliary Hilbert space built from all spatial gauge fields: we call these states $|\mathbf{A}(\mathbf{x}, t)\rangle$.
The physical Hilbert space is then defined as those states $|\text{phys}\rangle$ which obey

$$D_i E_i\, |\text{phys}\rangle = 0 \qquad (2.29)$$

Note that we do not set $D_i E_i = 0$ as an operator equation; this would not be compatible
with the commutation relations of the theory. Instead, we use it to define the physical
states.


There is an alternative way to think about the constraint (2.29). After we’ve picked
$A_0 = 0$ gauge, we still have further time-independent gauge transformations of the form

$$\mathbf{A} \to \Omega\, \mathbf{A}\, \Omega^{-1} + i\,\Omega \nabla \Omega^{-1}$$

Among these are global gauge transformations which, in the limit $x \to \infty$, asymptote
to $\Omega \to \text{constant} \neq 1$. These are sometimes referred to as large gauge transformations.
They should be thought of as global, physical symmetries rather than redundancies.
A similar interpretation holds in Maxwell theory where the corresponding conserved
quantity is electric charge. In the present case, we have a conserved charge for each
generator of the gauge group. The form of the charge follows from Noether’s theorem
and, for the gauge transformation $\Omega = e^{i\omega}$, is given by

$$Q(\omega) = \int d^3x\ \text{tr}\left( \pi_i\, \delta A_i \right) = \frac{1}{g^2}\int d^3x\ \text{tr}\left( E_i\, D_i\omega \right) + \frac{\theta}{8\pi^2}\int d^3x\ \text{tr}\left( B_i\, D_i\omega \right) = -\frac{1}{g^2}\int d^3x\ \text{tr}\left( D_i E_i\, \omega \right)$$

where we’ve used the fact that $D_i B_i = 0$. This is telling us that the Gauss’ law
$G^a = (D_i E_i)^a$ plays the role of the generator of the gauge symmetry. The constraint
(2.29) is the statement that we are sitting in the gauge singlet sector of the Hilbert
space where, for all $\omega$, $Q(\omega) = 0$.


2.2.2 The Wavefunction and the Chern-Simons Functional 


It’s rare in quantum field theory that we need to resort to the old-fashioned Schrödinger
representation of the wavefunction. But we will find it useful here. We will think of
the states in the auxiliary Hilbert space as wavefunctions of the form $\Psi(\mathbf{A})$. (Strictly
speaking, these are wavefunctionals because the argument $\mathbf{A}(\mathbf{x})$ is itself a function.)

In this language, the canonical momentum $\pi_i$ is, as usual in quantum mechanics,
$\pi_i = -i\,\delta/\delta A_i$. The Gauss’ law constraint then becomes

$$D_i\left( -i\frac{\delta \Psi}{\delta A_i} \right) = 0 \qquad (2.31)$$

Meanwhile, the Schrödinger equation is

$$H\Psi = g^2\, \text{tr}\left( -i\frac{\delta}{\delta \mathbf{A}} - \frac{\theta}{8\pi^2}\mathbf{B} \right)^2 \Psi + \frac{1}{g^2}\, \text{tr}\, \mathbf{B}^2\, \Psi = E\Psi \qquad (2.32)$$


This is now in a form that should be vaguely familiar from our first course in quantum
mechanics, albeit with an infinite number of degrees of freedom. All we have to do is
solve these equations. That, you may not be surprised to hear, is easier said than done.

We can, however, try to see the effect of the $\theta$ term. Suppose that we find a physical,
energy eigenstate, call it $\Psi_0(\mathbf{A})$, that solves both (2.31), as well as the Schrödinger
equation (2.32) with $\theta = 0$. That is,


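$$-g^2\, \text{tr}\, \frac{\delta^2 \Psi_0}{\delta \mathbf{A}^2} + \frac{1}{g^2}\, \text{tr}\, \mathbf{B}^2\, \Psi_0 = E\Psi_0 \qquad (2.33)$$

Now consider the following state

$$\Psi(\mathbf{A}) = e^{i\theta W[\mathbf{A}]}\, \Psi_0(\mathbf{A}) \qquad (2.34)$$

where $W[\mathbf{A}]$ is given by

$$W[\mathbf{A}] = \frac{1}{8\pi^2}\int d^3x\ \epsilon^{ijk}\, \text{tr}\left( F_{ij}A_k + \frac{2i}{3} A_i A_j A_k \right) \qquad (2.35)$$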


This is known as the Chern-Simons functional. It has a number of beautiful and subtle 
properties, some of which we will see below, some of which we will explore in Section 
8. It also plays an important role in the theory of the Quantum Hall Effect. Note that 
we’ve already seen the expression (2.35) before: when we wrote the $\theta$ term as a total
derivative (2.24), the temporal component was $K^0 = 4\pi^2 W$.


For now, the key property of $W(\mathbf{A})$ that we will need is

$$\frac{\delta W(\mathbf{A})}{\delta A_i} = \frac{1}{8\pi^2}\, \epsilon^{ijk}F_{jk} = \frac{1}{4\pi^2}\, B_i$$

which gives us the following relation,

$$-i\frac{\delta \Psi(\mathbf{A})}{\delta A_i} = -i\, e^{i\theta W(\mathbf{A})}\, \frac{\delta \Psi_0(\mathbf{A})}{\delta A_i} + \frac{\theta}{8\pi^2}\, B_i\, \Psi(\mathbf{A})$$

This ensures that $\Psi$ satisfies the Gauss law constraint (2.31). (To see this, you need
to convince yourself that the $D_i$ in (2.31) acts only on $\delta\Psi_0/\delta A_i$ in the first term above
and on $B_i$ in the second and then remember that $D_i B_i = 0$ by the Bianchi identity.)
Moreover, if $\Psi_0$ obeys the Schrödinger equation (2.33), then $\Psi$ will obey the Schrödinger
equation (2.32) with general $\theta$.


The above would seem to show that if we can construct a physical state $\Psi_0$ with
energy $E$ when $\theta = 0$ then we can dress this with the Chern-Simons functional $e^{i\theta W}$
to construct a state $\Psi$ which has the same energy $E$ when $\theta \neq 0$. In other words, the
physical spectrum of the theory appears to be independent of $\theta$. In fact, this conclusion
is wrong! The spectrum does depend on $\theta$. To understand the reason behind this, we
have to look more closely at the Chern-Simons functional (2.35).

Is the Chern-Simons Functional Gauge Invariant?


The Chern-Simons functional W[A] is not obviously gauge invariant. In fact, not only 
is it not obviously gauge invariant, it turns out that it’s not actually gauge invariant! 
But, as we now explain, it fails to be gauge invariant in an interesting way. 


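Let’s see what happens. In $A_0 = 0$ gauge, we can still act with time-independent
gauge transformations $\Omega(\mathbf{x}) \in G$, under which

$$\mathbf{A} \to \Omega\, \mathbf{A}\, \Omega^{-1} + i\,\Omega \nabla \Omega^{-1}$$

The spatial components of the field strength then change as $F_{ij} \to \Omega F_{ij} \Omega^{-1}$. It is not
difficult to check that the Chern-Simons functional (2.35) transforms as

$$W[\mathbf{A}] \to W[\mathbf{A}] + \frac{1}{4\pi^2}\int d^3x\ \left\{ i\,\epsilon^{ijk}\, \partial_j\, \text{tr}\left( \partial_i\Omega\, \Omega^{-1} A_k \right) - \frac{1}{3}\, \epsilon^{ijk}\, \text{tr}\left( \Omega^{-1}\partial_i\Omega\ \Omega^{-1}\partial_j\Omega\ \Omega^{-1}\partial_k\Omega \right) \right\}$$
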
The first term is a total derivative. It has an interesting role to play on manifolds with 
boundaries but will not concern us here. Instead, our interest lies in the second term. 
This is novel to non-Abelian gauge theories and has a beautiful interpretation. 


To understand this interpretation, we need to understand something about the topol- 
ogy of non-Abelian gauge transformations. As we now explain, these gauge transfor- 
mations fall into different classes. 


We’ve already met the first classification of gauge transformations. Those with $\Omega \neq 1$
at spatial infinity, $S^2_\infty = \partial\mathbb{R}^3$, are to be thought of as global symmetries. The remaining
gauge symmetries have $\Omega = 1$ on $S^2_\infty$. These are the ones that we are interested in
here.

Insisting that $\Omega \to 1$ at $S^2_\infty$ is equivalent to working on spatial $S^3$ rather than $\mathbb{R}^3$.
Each gauge transformation with this property then defines a map,

$$\Omega(\mathbf{x}) : S^3 \to G$$


Such maps fall into disjoint classes. This arises because the gauge transformations can
“wind” around the spatial $S^3$, in such a way that one gauge transformation cannot be
continuously transformed into another. We’ll meet this kind of idea a lot throughout
these lectures. Such maps are characterised by homotopy theory. In general, we will
be interested in the different classes of maps from spheres $S^n$ into some space X. Two
maps are said to be homotopic if they can be continuously deformed into each other.
The homotopically distinct maps are classified by the group $\Pi_n(X)$. For us, the relevant
formula is

$$\Pi_3(G) = \mathbb{Z}$$


for all simple, compact Lie groups G. In words, this means that the winding of gauge
transformations is classified by an integer n. This statement is intuitive for G = SU(2)
since $SU(2) \cong S^3$, so the homotopy group counts the winding of maps from $S^3 \to S^3$.
For higher dimensional G, it turns out that it’s sufficient to pick an SU(2) subgroup of
G and consider maps which wind within that. It turns out that these maps cannot be
unwound within the larger G. Moreover, all topologically non-trivial maps within G
can be deformed to lie within an SU(2) subgroup. It can be shown that this winding
is computed by

$$n(\Omega) = \frac{1}{24\pi^2}\int_{S^3} d^3S\ \epsilon^{ijk}\, \text{tr}\left( \Omega\,\partial_i\Omega^{-1}\ \Omega\,\partial_j\Omega^{-1}\ \Omega\,\partial_k\Omega^{-1} \right) \qquad (2.36)$$

We claim that this expression always spits out an integer $n(\Omega) \in \mathbb{Z}$. This integer
characterises the gauge transformation. It’s simple to check that $n(\Omega_1\Omega_2) = n(\Omega_1) + n(\Omega_2)$.

An Example: SU(2) 


We won’t prove that the expression (2.36) is an integer which counts the winding. 
We will, however, give a simple example which illustrates the basic idea. We pick 
gauge group G = SU(2). This is particularly straightforward because, as a manifold,
$SU(2) \cong S^3$ and it seems eminently plausible that $\Pi_3(S^3) \cong \mathbb{Z}$.

In this case, it is not difficult to give an explicit mapping which has winding number
n. Consider the radially symmetric gauge transformation

$$\Omega_n(\mathbf{x}) = \exp\left( \frac{i\omega(r)\, \sigma^i \hat{x}^i}{2} \right) = \cos\left( \frac{\omega}{2} \right) + i\sin\left( \frac{\omega}{2} \right) \sigma^i \hat{x}^i \qquad (2.37)$$

where $\omega(r)$ is some monotonic function such that

$$\omega(r) = \begin{cases} 0 & r = 0 \\ 4\pi n & r = \infty \end{cases}$$

Note that whenever $\omega$ is a multiple of $4\pi$ then $\Omega = e^{2\pi i\, \sigma^i \hat{x}^i} = 1$. This means that
as we move out radially from the origin, the gauge transformation (2.37) is equal to
the identity n times, starting at the origin and then on successive spheres $S^2$ before it
reaches the identity the final time at infinity $S^2_\infty$. If we calculate the winding (2.36) of
this map, we find

$$n(\Omega_n) = n$$


For more general non-Abelian gauge groups G, one can always embed the winding 
$\Omega_n(\mathbf{x})$ into an SU(2) subgroup. It turns out that it is not possible to unwind this
by moving in the larger G. Moreover, the converse also holds: given any non-trivial
winding $\Omega(\mathbf{x})$ in G, one can always deform $\Omega(\mathbf{x})$ until it sits entirely within an SU(2)
subgroup. 


The Chern-Simons Functional is not Gauge Invariant! 


We now see the relevance of these topologically non-trivial gauge transformations. 
Dropping the boundary term, the transformation of the Chern-Simons functional is 


$$W[\mathbf{A}] \to W[\mathbf{A}] + n$$


We learn that the Chern-Simons functional is not quite gauge invariant. But it only 
changes under topologically non-trivial gauge transformations, where it shifts by an 
integer. 


What does this mean for our wavefunctions? We will require that our wavefunctions
are gauge invariant, so that $\Psi(\mathbf{A}') = \Psi(\mathbf{A})$ with $\mathbf{A}' = \Omega\mathbf{A}\Omega^{-1} + i\,\Omega\nabla\Omega^{-1}$. Now,
however, we see the problem with our dressing argument. Suppose that we find a
wavefunction $\Psi_0(\mathbf{A})$ which is a state when $\theta = 0$ and is gauge invariant. Then the
dressed wavefunction

$$\Psi(\mathbf{A}) = e^{i\theta W[\mathbf{A}]}\, \Psi_0(\mathbf{A}) \qquad (2.38)$$

will indeed solve the Schrödinger equation for general $\theta$. But it is not gauge invariant:
instead it transforms as $\Psi(\mathbf{A}') = e^{i\theta n}\, \Psi(\mathbf{A})$.


This, then, is the way that the $\theta$ angle shows up in the states. We do require that
$\Psi(\mathbf{A})$ is gauge invariant, which means that it’s not enough to simply dress the $\theta = 0$
wavefunctions $\Psi_0(\mathbf{A})$ with the Chern-Simons functional $e^{i\theta W[\mathbf{A}]}$. Instead, if we want to
go down this path, we must solve the $\theta = 0$ Schrödinger equation with the requirement
that $\Psi_0(\mathbf{A}') = e^{-i\theta n}\, \Psi_0(\mathbf{A})$, so that this cancels the additional phase coming from the
dressing factor and $\Psi(\mathbf{A})$ is gauge invariant.

There is one last point: the value of $\theta$ only arises in the phase $e^{i\theta n}$ with $n \in \mathbb{Z}$. This
is the origin of the statement that $\theta$ is periodic mod $2\pi$. We take $\theta \in [0, 2\pi)$.

We have understood that the spectrum does depend on $\theta$. But we have not
understood how the spectrum depends on $\theta$. That is much harder. We will not have
anything to say here, but will return to this a number of times in these lectures, both 
in Section 2.3 where we discuss instantons and in Section 6 when we discuss the large 
N expansion. 


2.2.3 Analogies From Quantum Mechanics 


There’s an analogy that exhibits some (but not all) of the ideas above in a much simpler 
setting. Consider a particle of unit charge, restricted to move on a circle of radius R. 
Through the middle of the circle we thread a magnetic flux $\Phi$. Because the particle sits
away from the magnetic field, its classical motion is unaffected by the flux. Nonetheless, 
the quantum spectrum does depend on the flux and this arises for reasons very similar 
to those described above. 


Let’s recall how this works. The Hamiltonian for the particle is 


$$H = \frac{1}{2m}\left(-i\frac{\partial}{\partial x} + \frac{\Phi}{2\pi R}\right)^2$$


We can now follow our previous train of logic. Suppose that we found a state $\Psi_0$
which is an eigenstate of the Hamiltonian when $\Phi = 0$. We might think that we could
then just write down the new state $\Psi = e^{-i\Phi x/2\pi R}\,\Psi_0$, which is an eigenstate of the
Hamiltonian for non-zero $\Phi$. However, as in the Yang-Mills case above, this is too
quick. For our particle on a circle, it’s not large gauge transformations that we have
to worry about; instead, it’s simply the requirement that the wavefunction is single
valued. The dressing factor $e^{-i\Phi x/2\pi R}$ is only single valued if $\Phi$ is a multiple of $2\pi$.


Of course, the particle moving on a circle is much simpler than Yang-Mills. Indeed, 
there is no difficulty in just solving it explicitly. The single-valued wavefunctions have 
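the property that they are actually independent of $\Phi$. (There is no reason to believe
that this property also holds for Yang-Mills.) They are

$$\Psi = \frac{1}{\sqrt{2\pi R}}\,e^{inx/R}, \qquad n \in \mathbb{Z}$$

These solve the Schrödinger equation $H\Psi = E\Psi$ with energy

$$E = \frac{1}{2mR^2}\left(n + \frac{\Phi}{2\pi}\right)^2, \qquad n \in \mathbb{Z} \tag{2.39}$$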
We see that the spectrum of the theory does depend on the flux $\Phi$, even though the
particle never goes near the region with magnetic field. Moreover, as far as the particle
is concerned, the flux $\Phi$ is a periodic variable, with periodicity $2\pi$. In particular, if $\Phi$
is an integer multiple of $2\pi$, then the spectrum of the theory is unaffected by the flux.
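The periodicity of the spectrum in the flux is easy to check numerically. The following is a minimal sketch, assuming units with $m = R = 1$ by default; the function names `energy` and `low_levels` are ours, and the only physics input is the formula (2.39).

```python
import math


def energy(n, flux, m=1.0, R=1.0):
    """Energy (2.39) of the state labelled by the integer n at a given flux."""
    return (n + flux / (2.0 * math.pi)) ** 2 / (2.0 * m * R ** 2)


def low_levels(flux, count=4, n_max=20):
    """The `count` lowest energies, scanning n = -n_max, ..., n_max."""
    levels = sorted(energy(n, flux) for n in range(-n_max, n_max + 1))
    return levels[:count]


if __name__ == "__main__":
    # The individual labels n are reshuffled as the flux grows,
    # but the list of energies returns to itself after a shift by 2*pi.
    for flux in (0.0, 0.5 * math.pi, math.pi, 2.0 * math.pi):
        print(f"flux = {flux:.3f}", [round(e, 4) for e in low_levels(flux)])
```

At $\Phi = \pi$ the two lowest levels become degenerate, which is the analogue of the special behaviour one often finds at $\theta = \pi$.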


The Theta Angle as a “Hidden” Parameter 


There is an alternative way to view the problem of the particle moving on a circle. We 
explain this here before returning to Yang-Mills where we offer the same viewpoint. 
This new way of looking at things starts with a question: why should we insist that the 
wavefunction is single-valued? After all, we only measure probability $|\Psi|^2$, which cares
nothing for the phase. Does this mean that it’s consistent to work with wavefunctions 
that are not single-valued around the circle? 


The answer to this question is “yes”. Let’s see how it works. Consider the
Hamiltonian for a free particle on a circle of radius R,

$$H = -\frac{1}{2m}\frac{\partial^2}{\partial x^2} \tag{2.40}$$

In this way of looking at things, the Hamiltonian contains no trace of the flux. Instead, 
it will arise from the boundary conditions that we place on the wavefunction. We will 
not require that the wavefunction is single valued, but instead that it comes back to 
itself up to some specified phase $\Phi$, so that

$$\Psi(x + 2\pi R) = e^{i\Phi}\,\Psi(x)$$

The eigenstates of (2.40) with this requirement are

$$\Psi = \frac{1}{\sqrt{2\pi R}}\,e^{i(n + \Phi/2\pi)x/R}, \qquad n \in \mathbb{Z}$$


The energy of these states is again given by (2.39). We learn that allowing for more 
general wavefunctions doesn’t give any new physics. Instead, it allows for a different 
perspective on the same physics, in which the presence of the flux does not appear 
in the Hamiltonian, but instead is shifted to the boundary conditions imposed on the 
wavefunction. In this framework, the phase $\Phi$ is sometimes said to be a “hidden”
parameter because you don’t see it directly in the Hamiltonian. 


We can now ask this same question for Yang-Mills. We’ll start with Yang-Mills theory
in the absence of a $\theta$ term and will see how we can recover the states with $\theta \neq 0$. Here,
the analogous question is whether the wavefunction $\Psi_0(A)$ should really be gauge invariant,
or whether we can suffer an additional phase under a gauge transformation. The phase
that the wavefunction picks up should be consistent with the group structure of gauge
transformations: this means that we are looking for a one-dimensional representation
(the phase) of the group of gauge transformations.

Topologically trivial gauge transformations (which have $n(\Omega) = 0$) can be
continuously connected to the identity. For these, there’s no way to build a non-trivial phase
factor consistent with the group structure: it must be the case that $\Psi_0(A') = \Psi_0(A)$
whenever $A' = \Omega A\Omega^{-1} + i\Omega\nabla\Omega^{-1}$ with $n(\Omega) = 0$.


However, things are different for the topologically non-trivial gauge transformations.
As we’ve seen above, these are labelled by their winding $n(\Omega) \in \mathbb{Z}$. One could require
that, under these topologically non-trivial gauge transformations, the wavefunction
changes as

$$\Psi_0(A') = e^{i\theta n}\,\Psi_0(A) \tag{2.41}$$

for some choice of $\theta \in [0, 2\pi)$. This is consistent with consecutive gauge transformations
because $n(\Omega_1\Omega_2) = n(\Omega_1) + n(\Omega_2)$. In this way, we introduce an angle $\theta$ into the
definition of the theory through the boundary conditions on wavefunctions.


It should be clear that the discussion above is just another way of stating our earlier 
results. Given a wavefunction which transforms as (2.41), we can always dress it with a 
Chern-Simons functional as in (2.38) to construct a single-valued wavefunction. These 
are just two different paths that lead to the same conclusion. We’ve highlighted the
“hidden” interpretation here in part because it is often the way the $\theta$ angle is introduced
in the literature. Moreover, as we will see in more detail in Section 2.3, it is closer in
spirit to the way the $\theta$ angle appears in semi-classical tunnelling calculations.


Another Analogy: Bloch Waves 


There’s another analogy which is often wheeled out to explain how $\theta$ affects the states.
This analogy has some utility, but it also has some flaws. I’ll try to highlight both
below.


So far our discussion of the $\theta$ angle has been for all states in the Hilbert space. For this
analogy, we will focus on the ground state. Moreover, we will work “semi-classically”,
which really means “classically” but where we use the language of wavefunctions. I
should stress that this approximation is not valid: as we will see in Section 2.4,
Yang-Mills theory is a strongly coupled quantum theory, and the true ground state will bear
no resemblance to the classical ground state. The purpose of what follows is merely to
highlight the basic structure of the Hilbert space.


With these caveats out the way, let’s proceed. The classical ground states of Yang- 
Mills are pure gauge configurations. This means that they take the form 


$$A = iV\nabla V^{-1} \tag{2.42}$$

for some $V(\mathbf{x}) \in G$. But, as we’ve seen above, such configurations are labelled by the
integer $n(V)$. This is a slightly different role for the winding: now it is labelling the
zero energy states in the theory, as opposed to gauge transformations. At the
semi-classical level, the configurations (2.42) map into quantum states. Since the classical
configurations are labelled by an integer $n(V)$, this should carry over to the quantum
Hilbert space. We call the corresponding ground states $|n\rangle$ with $n \in \mathbb{Z}$.


If we were to stop here, we might be tempted to conclude that Yang-Mills has multiple
ground states, $|n\rangle$. But this would be too hasty. All of these ground states are connected
by gauge transformations. But the gauge transformation itself must have non-trivial
topology. Specifically, if $\Omega$ is a gauge transformation with $n(\Omega) = n'$ then
$\Omega|n\rangle = |n + n'\rangle$.


The true ground state, like all states in the Hilbert space, should obey (2.41). For 
our states, this reads 


$$\Omega|\theta\rangle = e^{i\theta n'}|\theta\rangle$$

This means that the physical ground state of the system is a coherent sum over all the
states $|n\rangle$. It takes the form

$$|\theta\rangle = \sum_n e^{i\theta n}|n\rangle \tag{2.43}$$


This is the semi-classical approximation to the ground state of Yang-Mills theory. These 
states are sometimes referred to as theta vacua. Once again, I stress that the semi- 
classical approximation is a rubbish approximation in this case! This is not close to 
the true ground state of Yang-Mills. 


Now to the analogy, which comes from condensed matter physics. Consider a particle 
moving in a one-dimensional periodic potential 


V(x)=V(x+a) 


Classically there are an infinite number of ground states corresponding to the minima of
the potential. We describe these states as $|n\rangle$ with $n \in \mathbb{Z}$. However, we know that these
aren’t the true ground states of the Hamiltonian. These are given by Bloch’s theorem
which states that all eigenstates have the form

$$|k\rangle = \sum_n e^{ikan}|n\rangle \tag{2.44}$$

for some $k \in [-\pi/a, \pi/a)$ called the lattice momentum. Clearly there is a parallel
between (2.43) and (2.44). In some sense, the $\theta$ angle plays a role in Yang-Mills similar
to the combination $ka$ for a particle in a periodic potential. This similarity can be traced
to the underlying group theory structure. In both cases there is a $\mathbb{Z}$ group action on the
states. For the particle in a lattice, this group is generated by the translation operator;
for Yang-Mills it is generated by the topologically non-trivial gauge transformation
with $n(\Omega) = 1$.


There is, however, an important difference between these two situations. For the
particle in a potential, all the states $|k\rangle$ lie in the Hilbert space. Indeed, the
spectrum famously forms a band labelled by $k$. In contrast, in Yang-Mills theory there is
only a single state: each theory has a specific $\theta$ which picks out one state from the
band. This can be traced to the different interpretation of the group generators. The
translation operator for a particle is a genuine symmetry, moving one physical state to
another. In contrast, the topologically non-trivial gauge transformation $\Omega$ is, like all
gauge transformations, a redundancy: it relates physically identical states, albeit up
to a phase.


2.3 Instantons 


We have argued that the theta angle is an important parameter in Yang-Mills, changing 
the spectrum and correlation functions of the theory. This is in contrast to
electromagnetism where $\theta$ only plays a role in the presence of boundaries (such as topological
insulators) or magnetic monopoles. It is natural to ask: how do we see this from the 
path integral? 


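To answer this question, recall that the theta term is a total derivative

$$S_\theta = \frac{\theta}{16\pi^2}\int d^4x\ \mathrm{tr}\,{}^\star F^{\mu\nu}F_{\mu\nu} = \frac{\theta}{8\pi^2}\int d^4x\ \partial_\mu K^\mu$$

where

$$K^\mu = \epsilon^{\mu\nu\rho\sigma}\,\mathrm{tr}\left(A_\nu\partial_\rho A_\sigma - \frac{2i}{3}A_\nu A_\rho A_\sigma\right)$$

This means that if a field configuration is to have a non-vanishing value of $S_\theta$, then it
must have something interesting going on at infinity.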


At this point, we do something important: we Wick rotate so that we work in
Euclidean spacetime $\mathbb{R}^4$. We will explain the physical significance of this in Section
2.3.2. Configurations that have finite action $S_{YM}$ must asymptote to pure gauge,

$$A_\mu \to i\Omega\partial_\mu\Omega^{-1} \quad\text{as}\quad x \to \infty \tag{2.45}$$

with $\Omega \in G$. This means that finite action, Euclidean field configurations involve a
map

$$\Omega(x):\ S^3_\infty \mapsto G$$

But we have met such maps before: they are characterised by the homotopy group
$\Pi_3(G) = \mathbb{Z}$. Plugging this asymptotic ansatz (2.45) into the action $S_\theta$, we have

$$S_\theta = \theta\nu \tag{2.46}$$

where $\nu \in \mathbb{Z}$ is an integer that tells us the number of times that $\Omega(x)$ winds around
the asymptotic $S^3_\infty$,

$$\nu(\Omega) = \frac{1}{24\pi^2}\int_{S^3_\infty} d^3S\ \epsilon^{ijk}\,\mathrm{tr}\,(\Omega\partial_i\Omega^{-1})(\Omega\partial_j\Omega^{-1})(\Omega\partial_k\Omega^{-1}) \tag{2.47}$$


This is the same winding number that we met previously in (2.36). 


This discussion is mathematically identical to the classification of non-trivial gauge 
transformations in Section 2.2.2. However, the physical setting is somewhat different. 
Here we are talking about maps from the boundary of (Euclidean) spacetime $S^3_\infty$,
while in Section 2.2.2 we were talking about maps from a spatial slice, $\mathbb{R}^3$, suitably
compactified to become $S^3$. We will see the relationship between these in Section 2.3.2.


2.3.1 The Self-Dual Yang-Mills Equations 
Among the class of field configurations with non-vanishing winding $\nu$ there are some
that are special: these solve the classical equations of motion,

$$D_\mu F^{\mu\nu} = 0 \tag{2.48}$$


There is a cute way of finding solutions to this equation. The Yang-Mills action is

$$S_{YM} = \frac{1}{2g^2}\int d^4x\ \mathrm{tr}\,F_{\mu\nu}F^{\mu\nu}$$

Note that in Euclidean space, the action comes with a + sign. This is to be contrasted
with the Minkowski space action (2.8) which comes with a minus sign. We can write
this as

$$\begin{aligned}
S_{YM} &= \frac{1}{4g^2}\int d^4x\ \mathrm{tr}\left(F_{\mu\nu} \mp {}^\star F_{\mu\nu}\right)^2 \pm \frac{1}{2g^2}\int d^4x\ \mathrm{tr}\,F_{\mu\nu}{}^\star F^{\mu\nu} \\
&\geq \pm\frac{8\pi^2}{g^2}\,\nu
\end{aligned}$$

where, in the last line, we’ve used the result (2.46). We learn that in the sector with
winding $\nu$, the Yang-Mills action is bounded by $8\pi^2|\nu|/g^2$. The action is minimised
when the bound is saturated. This occurs when

$$F_{\mu\nu} = \pm{}^\star F_{\mu\nu} \tag{2.49}$$


These are the (anti) self-dual Yang-Mills equations. The argument above shows that
solutions to these first order equations necessarily minimise the action in a given
topological sector and so must solve the equations of motion (2.48). In fact, it’s
straightforward to see that this is the case since it follows immediately from the Bianchi identity
$D_\mu{}^\star F^{\mu\nu} = 0$. The kind of “completing the square” trick that we used above, where we
bound the action by a topological invariant, is known as the Bogomolnyi bound. We’ll
see it a number of times in these lectures.


Solutions to the (anti) self-dual Yang-Mills equations (2.49) are known as instantons. 
This is because, as we will see below, the action density is localised at both a point in 
space and at an instant in (admittedly, Euclidean) time. They contribute to the path 
integral with a characteristic factor 


$$e^{-S_{\rm instanton}} = e^{-8\pi^2|\nu|/g^2}\,e^{i\theta\nu} \tag{2.50}$$


Note that the Yang-Mills contribution is real because we’ve Wick rotated to Euclidean 
space. However, the contribution from the theta term remains complex even after Wick 
rotation. This is typical behaviour for such topological terms that sit in the action with 
epsilon symbols. 


A Single Instanton in SU(2) 
We will focus on gauge group $G = SU(2)$ and solve the self-dual equations $F_{\mu\nu} = {}^\star F_{\mu\nu}$
with winding number $\nu = 1$. As we’ve seen, asymptotically the gauge field must be
pure gauge, and so takes the form $A_\mu \to i\Omega\partial_\mu\Omega^{-1}$. An example of a map $\Omega(x) \in SU(2)$
with winding $\nu = 1$ is given by

$$\Omega(x) = \frac{x_\mu\sigma^\mu}{\sqrt{x^2}} \quad\text{where}\quad \sigma^\mu = (1, -i\vec{\sigma})$$

With this choice, the asymptotic form of the gauge field is given by³

$$A_\mu \to i\Omega\partial_\mu\Omega^{-1} = \frac{1}{x^2}\,\eta^i_{\mu\nu}x^\nu\sigma^i \quad\text{as}\quad x \to \infty$$

³ In the lecture notes on Solitons, the instanton solution was presented in singular gauge, where it
takes a similar, but noticeably different form.

Here the $\eta^i_{\mu\nu}$ are usually referred to as ’t Hooft matrices. They are three 4 × 4 matrices
which provide an irreducible representation of the su(2) Lie algebra. They are given
by

$$\eta^1_{\mu\nu} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \quad
\eta^2_{\mu\nu} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad
\eta^3_{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}$$

These matrices are self-dual: they obey $\frac{1}{2}\epsilon_{\mu\nu\rho\sigma}\eta^i_{\rho\sigma} = \eta^i_{\mu\nu}$. This will prove important.
(Note that we’re not being careful about indices up vs down as we are in Euclidean
space with no troublesome minus signs.) The full gauge potential should now be of
the form $A_\mu = if(x)\,\Omega\partial_\mu\Omega^{-1}$ for some function $f(x) \to 1$ as $x \to \infty$. The right choice
turns out to be $f(x) = x^2/(x^2 + \rho^2)$ where $\rho$ is a parameter whose role will be clarified
shortly. We then have the gauge field

$$A_\mu = \frac{1}{x^2 + \rho^2}\,\eta^i_{\mu\nu}x^\nu\sigma^i \tag{2.51}$$


You can check that the associated field strength is

$$F_{\mu\nu} = -\frac{2\rho^2}{(x^2 + \rho^2)^2}\,\eta^i_{\mu\nu}\sigma^i$$

This inherits its self-duality from the ’t Hooft matrices and therefore solves the
Yang-Mills equations of motion.
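As a sanity check, we can confirm numerically that this configuration saturates the bound $8\pi^2/g^2$, whatever the value of $\rho$. The sketch below assumes the field strength $F_{\mu\nu} = -2\rho^2\,\eta^i_{\mu\nu}\sigma^i/(x^2+\rho^2)^2$, together with $\eta^i_{\mu\nu}\eta^j_{\mu\nu} = 4\delta^{ij}$ and $\mathrm{tr}\,\sigma^i\sigma^j = 2\delta^{ij}$, so that $\mathrm{tr}\,F_{\mu\nu}F^{\mu\nu} = 96\rho^4/(x^2+\rho^2)^4$. The function names and the choice of Simpson's rule are ours.

```python
import math


def trace_f_squared(r, rho):
    """tr F_{mu nu} F^{mu nu} at radius r, assuming the profile quoted in the lead-in."""
    return 96.0 * rho ** 4 / (r ** 2 + rho ** 2) ** 4


def instanton_action(rho, g=1.0, steps=2000):
    """S_YM = (1 / 2 g^2) * integral d^4x tr F F, using Simpson's rule.

    The angular integral over the unit S^3 gives 2 pi^2. The radial integral
    uses r = rho * tan(t) to map [0, infinity) onto [0, pi/2).
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = (math.pi / 2.0) / steps
    total = 0.0
    for k in range(steps + 1):
        if k == steps:
            f = 0.0  # the integrand vanishes as r -> infinity
        else:
            t = k * h
            r = rho * math.tan(t)
            jacobian = rho / math.cos(t) ** 2
            f = r ** 3 * trace_f_squared(r, rho) * jacobian
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * f
    radial = total * h / 3.0
    return 2.0 * math.pi ** 2 * radial / (2.0 * g ** 2)


if __name__ == "__main__":
    for rho in (0.1, 1.0, 10.0):
        print(rho, instanton_action(rho), 8.0 * math.pi ** 2)
```

The fact that the answer does not depend on `rho` is the numerical face of classical scale invariance.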


The instanton solution (2.51) is not unique. By acting on this solution with various
symmetries, we can easily generate more solutions. The most general solution with
winding $\nu = 1$ depends on 8 parameters which, in this context, are referred to as
collective coordinates. Each of them has a simple explanation:

• The instanton solution above is localised at the origin. But we can always generate
a new solution localised at any point $X \in \mathbb{R}^4$ simply by replacing $x^\mu \to x^\mu - X^\mu$
in (2.51). This gives 4 collective coordinates.

• We’ve kept one parameter $\rho$ explicit in the solution (2.51). This is the scale
size of the instanton, an interpretation which is clear from looking at the field
strength which is localised in a ball of radius $\rho$. The existence of this collective
coordinate reflects the fact that the classical Yang-Mills theory is scale invariant:
if a solution exists with one size, it should exist with any size. This property is
broken in the quantum theory by the running of the coupling constant, and this
has implications for instantons that we will describe below.

• The final three collective coordinates arise from the global part of the gauge
group. These are gauge transformations which do not die off asymptotically, and
correspond to three physical symmetries of the theory, rather than redundancies.
For our purposes, we can consider a constant $V \in SU(2)$, and act as
$A_\mu \to VA_\mu V^{-1}$.

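Before we proceed, we pause to mention that it is straightforward to write down a
corresponding anti-self-dual instanton with winding $\nu = -1$. We simply replace the ’t
Hooft matrices with their anti-self-dual counterparts,

$$\bar{\eta}^1_{\mu\nu} = \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \quad
\bar{\eta}^2_{\mu\nu} = \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad
\bar{\eta}^3_{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}$$

They obey $\frac{1}{2}\epsilon_{\mu\nu\rho\sigma}\bar{\eta}^i_{\rho\sigma} = -\bar{\eta}^i_{\mu\nu}$ and one can use these to build a gauge potential (2.51)
with $\nu = -1$. These too form an irreducible representation of su(2), and obey
$[\eta^i, \bar{\eta}^j] = 0$. The fact that we can find two commuting su(2) algebras hiding in a 4 × 4 matrix
reflects the fact that $Spin(4) \cong SU(2) \times SU(2)$ and, correspondingly, the Lie algebras
are $so(4) = su(2) \oplus su(2)$.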
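The algebraic claims about the ’t Hooft matrices can be checked by brute force. The sketch below transcribes the six matrices from the text (any transcription slip would show up as a failed check) and tests (anti) self-duality, the su(2) commutators and the fact that the two sets commute. The helper names are ours, and the normalisation $[\eta^i, \eta^j] = -2\epsilon^{ijk}\eta^k$ is what direct multiplication of these particular matrices gives.

```python
ETA = [
    [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
]
ETA_BAR = [
    [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]],
    [[0, 0, -1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 0, -1], [0, 0, 1, 0], [0, -1, 0, 0], [1, 0, 0, 0]],
]
ZERO = [[0] * 4 for _ in range(4)]


def levi_civita(indices):
    """Sign of a permutation of (0, 1, 2, 3); zero if an index repeats."""
    if len(set(indices)) < len(indices):
        return 0
    sign = 1
    for i in range(len(indices)):
        for j in range(i + 1, len(indices)):
            if indices[i] > indices[j]:
                sign = -sign
    return sign


def dual(m):
    """(*m)_{ab} = (1/2) eps_{abcd} m_{cd} for an antisymmetric integer matrix."""
    return [[sum(levi_civita((a, b, c, d)) * m[c][d]
                 for c in range(4) for d in range(4)) // 2
             for b in range(4)] for a in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]


def scale(c, m):
    return [[c * entry for entry in row] for row in m]


if __name__ == "__main__":
    print("self-dual:", all(dual(e) == e for e in ETA))
    print("anti-self-dual:", all(dual(e) == scale(-1, e) for e in ETA_BAR))
    print("commuting:", all(commutator(e, eb) == ZERO for e in ETA for eb in ETA_BAR))
```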


General Instanton Solutions 
To get an instanton solution in SU(N), we could take the SU(2) solution (2.51) and
simply embed it in the upper left-hand corner of an N × N matrix. We can then rotate
this into other embeddings by acting with SU(N), modulo the stabilizer which leaves
the configuration untouched. This leaves us with the action of

$$\frac{SU(N)}{S[U(N-2) \times U(2)]}$$

where the $U(N-2)$ hits the lower-right-hand corner and doesn’t see our solution, while
the $U(2)$ is included in the denominator because it acts like $V$ in the original solution
(2.51) and we don’t want to over count. The notation $S[U(p) \times U(q)]$ means that we
lose the overall central $U(1) \subset U(p) \times U(q)$. The coset space above has dimension
$4N - 8$. This means that the solution in which (2.51) is embedded into SU(N) comes
with 4N collective coordinates. This is the most general $\nu = 1$ instanton solution in
SU(N).
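The counting is simple enough to do by hand, but it is also easy to automate. The following sketch just subtracts group dimensions, using dim SU(n) = n² − 1 and dim U(n) = n²; the function names are ours.

```python
def dim_su(n):
    """Dimension of SU(n); zero for the trivial cases n = 0, 1."""
    return max(n * n - 1, 0)


def dim_u(n):
    """Dimension of U(n), with U(0) taken to be trivial."""
    return n * n


def dim_coset(n):
    """Dimension of SU(N) / S[U(N-2) x U(2)] for N >= 2."""
    if n < 2:
        raise ValueError("need N >= 2 to embed an SU(2) instanton")
    stabiliser = dim_u(n - 2) + dim_u(2) - 1  # drop the overall U(1)
    return dim_su(n) - stabiliser


def single_instanton_moduli(n):
    """8 collective coordinates of the SU(2) solution plus the embedding coset."""
    return 8 + dim_coset(n)


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, dim_coset(n), single_instanton_moduli(n))
```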


What about solutions with higher $\nu$? There is a beautiful story here. It turns out
that such solutions exist and have $4N\nu$ collective coordinates. Among these solutions
are configurations which look like $\nu$ well separated instantons, each with 4N
collective coordinates describing its position, scale size and orientation. However, as the
instantons overlap this interpretation breaks down.

Figure 8: The double well. Figure 9: The upside down double well.


Remarkably, there is a procedure to generate all solutions for general $\nu$. It turns out
that one can reduce the non-linear partial differential equations (2.49) to a
straightforward algebraic equation. This is known as the ADHM construction and is possible due
to some deep integrable properties of the self-dual Yang-Mills equations. You can read 
more about this construction (from the perspective of D-branes and string theory) in 
the lectures on Solitons. 


2.3.2 Tunnelling: Another Quantum Mechanics Analogy 


We’ve found solutions in Euclidean spacetime that contribute to the theta dependence 
in the path integral. But why Euclidean rather than Lorentzian spacetime? The answer 
is that solutions to the Euclidean equations of motion describe quantum tunnelling. 


This is best illustrated by a simple quantum mechanical example. Consider the 
double well potential shown in the left-hand figure. Clearly there are two classical 
ground states, corresponding to the two minima. But we know that a quantum particle 
sitting in one minimum can happily tunnel through to the other. The end result is that 
the quantum theory has just a single ground state. 


How can we see this behaviour in the path integral? There are no classical solutions 
to the equations of motion which take us from one minimum to the other. However, 
things are rather different in Euclidean time. We define 


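$$\tau = it$$

After this Wick rotation, the action

$$S[x(t)] = \int dt\ \left[\frac{m}{2}\left(\frac{dx}{dt}\right)^2 - V(x)\right]$$

turns into the Euclidean action

$$S_E[x(\tau)] = -iS = \int d\tau\ \left[\frac{m}{2}\left(\frac{dx}{d\tau}\right)^2 + V(x)\right]$$

We see that the Wick rotation has the effect of inverting the potential: $V(x) \to -V(x)$.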
In Euclidean time, the classical ground states correspond to the maxima of the inverted 
potential. But now there is a perfectly good solution to the equations of motion, 
in which we roll from one maximum to the other. We come to a rather surprising 
conclusion: quantum tunnelling can be viewed as classical motion in imaginary time! 


As an example, consider the quartic potential

$$V(x) = \lambda(x^2 - a^2)^2 \tag{2.52}$$

which has minima at $x = \pm a$. Then a solution to the equations of motion which
interpolates between the two ground states in Euclidean time is given by

$$\bar{x}(\tau) = a\tanh\left(\frac{\omega}{2}(\tau - \tau_0)\right) \tag{2.53}$$

with $\omega^2 = 8\lambda a^2/m$. This solution is the instanton for quantum mechanics in the double
well potential. There is also an anti-instanton solution that interpolates from $x = +a$
to $x = -a$. The (anti)-instanton solution is localised in a region $1/\omega$ in imaginary time.
In this case, there is just a single collective coordinate, $\tau_0$, whose existence follows from
time translational invariance of the quantum mechanics.
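It is a useful exercise to check this solution numerically. The sketch below assumes the potential $V(x) = \lambda(x^2 - a^2)^2$ and the profile $\bar{x}(\tau) = a\tanh(\omega(\tau - \tau_0)/2)$ with $\omega^2 = 8\lambda a^2/m$. It verifies that the "energy" $\frac{m}{2}\dot{x}^2 - V$ vanishes along the path, and integrates the Euclidean action, which should come out as $\frac{4}{3}a^3\sqrt{2m\lambda}$ (a value obtained from $S = \int m\dot{x}\,dx$, not quoted in the text). The default parameter values are arbitrary choices of ours.

```python
import math


def omega(a, lam, m):
    return math.sqrt(8.0 * lam * a * a / m)


def instanton(tau, a=1.0, lam=0.5, m=1.0, tau0=0.0):
    """The profile (2.53), centred at Euclidean time tau0."""
    return a * math.tanh(0.5 * omega(a, lam, m) * (tau - tau0))


def velocity(tau, a=1.0, lam=0.5, m=1.0, tau0=0.0):
    """Analytic derivative of the instanton profile with respect to tau."""
    w = omega(a, lam, m)
    return 0.5 * a * w / math.cosh(0.5 * w * (tau - tau0)) ** 2


def potential(x, a=1.0, lam=0.5):
    return lam * (x * x - a * a) ** 2


def euclidean_action(a=1.0, lam=0.5, m=1.0, half_width=20.0, steps=20000):
    """Simpson integral of (m/2) xdot^2 + V(x) over [-half_width, half_width].

    `steps` should be even; the tails outside the window are exponentially small.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        tau = -half_width + k * h
        x = instanton(tau, a, lam, m)
        lagrangian = 0.5 * m * velocity(tau, a, lam, m) ** 2 + potential(x, a, lam)
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * lagrangian
    return total * h / 3.0


if __name__ == "__main__":
    print(euclidean_action(), 4.0 / 3.0 * math.sqrt(2.0 * 0.5))
```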


Returning to Yang-Mills, we now seek a similar tunnelling interpretation for the
instanton solutions. In the semi-classical approximation, the instantons tunnel between
the $|n\rangle$ vacua that we described in Section 2.2.3. Recall that the semi-classical vacuum
is defined by $A_i = iV\partial_iV^{-1}$ on a spatial slice $\mathbb{R}^3$, which we subsequently compactify to
$S^3$. The vacuum $|n\rangle$ is associated to maps $V(\mathbf{x}): S^3 \mapsto G$ with winding $n$, defined in
(2.36).


We noted previously that the construction of the vacua $|n\rangle$ in
terms of winding relies on topological arguments which are similar
to those which underlie the existence of instantons. To see the
connection, we can take the definition of the instanton winding
(2.47) and deform the integration region from the asymptotic
$S^3_\infty = \partial\mathbb{R}^4$ to the two asymptotic three spheres $S^3_\pm$ which we
think of as the compactified $\mathbb{R}^3_\pm$ spatial slices at $t = \pm\infty$. We
can then compare the instanton winding (2.47) to the definition
of the vacuum states (2.36), to write

$$\nu(U) = n_+(U) - n_-(U)$$

We learn that the Yang-Mills instanton describes tunnelling between the two
semi-classical vacua, $|n_-\rangle \to |n_+\rangle = |n_- + \nu\rangle$, as shown in the figure.

[Figure 10]


2.3.3 Instanton Contributions to the Path Integral 


Given an instanton solution, our next task is to calculate something. The idea is to use 
the instanton as the starting point for a semi-classical evaluation of the path integral. 


We can first illustrate this in our quantum mechanics analogy, where we would like
to compute the amplitude to tunnel from one classical ground state $|x = -a\rangle$ to the
other $|x = +a\rangle$ over some time T,

$$\langle a|e^{-HT}|{-a}\rangle = \mathcal{N}\int_{x(0)=-a}^{x(T)=+a}\mathcal{D}x(\tau)\ e^{-S_E[x(\tau)]}$$

with $\mathcal{N}$ a normalisation constant that we shall do our best to avoid calculating. There
is a general strategy for computing instanton contributions to path integrals which we
sketch here. This strategy will be useful in later sections (such as Sections 7.2 and 8.3,
where we discuss instantons in 2d and 3d gauge theories respectively). However, we’ll
see that we run into some difficulties when applying these ideas to Yang-Mills theories
in d = 3+1 dimensions.


Given an instanton solution $\bar{x}(\tau)$, like (2.53), we write the general $x(\tau)$ as

$$x(\tau) = \bar{x}(\tau) + \delta x(\tau)$$

and expand the Euclidean action as

$$S_E[x(\tau)] = S_{\rm instanton} + \int d\tau\ \delta x\,\Delta\,\delta x + O(\delta x^3) \tag{2.54}$$

Here $S_{\rm instanton} = S_E[\bar{x}(\tau)]$. There are no terms that are linear in $\delta x$ because $\bar{x}(\tau)$ solves
the equations of motion. The expansion of the action to quadratic order gives the
differential operator $\Delta$. The semi-classical approach is valid if the higher order terms
give sub-leading corrections to the path integral. For our quantum mechanics double
well potential, one can check that this holds provided $\lambda \ll 1$ in (2.52). For Yang-Mills,
this requirement will ultimately make us think twice about the semi-classical expansion.


Substituting the expansion (2.54) into the path integral, we’re left with the usual
Gaussian integral. It’s tempting to write

$$\begin{aligned}
\int_{x(0)=-a}^{x(T)=+a}\mathcal{D}x(\tau)\ e^{-S_E[x(\tau)]} &= e^{-S_{\rm instanton}}\int_{\delta x(0)=0}^{\delta x(T)=0}\mathcal{D}\delta x(\tau)\ e^{-\int d\tau\,\delta x\,\Delta\,\delta x + O(\delta x^3)} \\
&\approx \frac{e^{-S_{\rm instanton}}}{\det^{1/2}\Delta}
\end{aligned}$$

This, however, is a little too quick. The problem comes because the operator $\Delta$ has
a zero eigenvalue which makes the answer diverge. A zero eigenvalue of $\Delta$ occurs if
there are any deformations of the solution $\bar{x}(\tau)$ which do not change the action. But
we know that such deformations do indeed exist since the instanton solutions are never
unique: they depend on collective coordinates. In our quantum mechanics example,
there is just a single collective coordinate, called $\tau_0$ in (2.53), which means that the
deformation $\delta x = \partial\bar{x}/\partial\tau_0$ is a zero mode: it is annihilated by $\Delta$.


To deal with this, we need to postpone the integration over any zero mode. These
can then be replaced by an integration over the associated collective coordinate. For
our quantum mechanics example, we have

$$\int_{x(0)=-a}^{x(T)=+a}\mathcal{D}x(\tau)\ e^{-S_E[x(\tau)]} \approx \int_0^T d\tau_0\ J\,\frac{e^{-S_{\rm instanton}}}{\det'^{1/2}\Delta}$$

Here J is the Jacobian factor that comes from changing the integration variable from
the zero mode to the collective coordinate. We will not calculate it here. Meanwhile
the notation $\det'$ means that we omit the zero eigenvalue of $\Delta$ when computing the
determinant. The upshot is that a single instanton gives a saddle point contribution
to the tunnelling amplitude,

$$\langle a|e^{-HT}|{-a}\rangle \approx KT\,e^{-S_{\rm instanton}} \quad\text{with}\quad K = \frac{\mathcal{N}J}{\det'^{1/2}\Delta}$$

Note that we’ve packaged all the things that we couldn’t be bothered to calculate into
a single constant, K.


The result above gives the contribution from a single instanton to the tunnelling 
amplitude. But, it turns out, this is not the dominant contribution. That, instead, 
comes from summing over many such tunnelling events. 


Consider a configuration consisting of a string of instantons and anti-instantons.
Each instanton must be followed by an anti-instanton and vice versa. This
configuration does not satisfy the equation of motion. However, if the (anti) instantons are
well separated, with a spacing $\gg 1/\omega$, then the configuration very nearly satisfies the
equations of motion; it fails only by exponentially suppressed terms. We refer to this
as a dilute gas of instantons.


As above, we should integrate over the positions of the instantons and anti-instantons.
Because each of these is sandwiched between two others, this leads to the integration

$$\int_0^T dt_1\int_{t_1}^T dt_2\ \ldots\int_{t_{n-1}}^T dt_n = \frac{T^n}{n!}$$

where we’re neglecting the thickness $1/\omega$ of each instanton.

A configuration consisting of n instantons and anti-instantons is more highly
suppressed since its action is approximately $nS_{\rm instanton}$. But, as we now see, these
contributions dominate because of entropic factors: there are many more of them. Summing
over all such possibilities, we have

$$\langle a|e^{-HT}|{-a}\rangle \approx \sum_{n\ \mathrm{odd}}\frac{1}{n!}\left(KT\,e^{-S_{\rm instanton}}\right)^n = \sinh\left(KT\,e^{-S_{\rm instanton}}\right)$$


where we restrict the sum to n odd to ensure that we end up in a different classical
ground state from where we started. We haven’t made any effort to normalise this
amplitude, but we can compare it to the amplitude to propagate from the state $|{-a}\rangle$
back to $|{-a}\rangle$,

$$\langle{-a}|e^{-HT}|{-a}\rangle \approx \sum_{n\ \mathrm{even}}\frac{1}{n!}\left(KT\,e^{-S_{\rm instanton}}\right)^n = \cosh\left(KT\,e^{-S_{\rm instanton}}\right)$$

In the long time limit $T \to \infty$, we see that we lose information about where we started,
and we’re equally likely to find ourselves in either of the ground states $|a\rangle$ or $|{-a}\rangle$. If
we were more careful about the overall normalisation, we could also use this argument
to compute the energy splitting between the ground state and the first excited state.
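The dilute gas sums are simple enough to evaluate term by term. In the sketch below, the single variable `z` stands for the combination $KT\,e^{-S_{\rm instanton}}$, which we treat as a free input since we never computed K. The function names, and the use of the ratio of the two unnormalised amplitudes as a "relative weight", are our own illustrative choices.

```python
import math


def dilute_gas_sum(z, parity, n_max=200):
    """Sum of z^n / n! over n <= n_max with n % 2 == parity (1 = odd, 0 = even)."""
    total = 0.0
    term = 1.0  # z^n / n!, starting from n = 0
    for n in range(n_max + 1):
        if n % 2 == parity:
            total += term
        term *= z / (n + 1)
    return total


def relative_weight_flipped(z):
    """Weight of ending at +a relative to the sum of both outcomes."""
    odd = dilute_gas_sum(z, 1)
    even = dilute_gas_sum(z, 0)
    return odd / (odd + even)


if __name__ == "__main__":
    for z in (0.0, 0.1, 1.0, 5.0, 20.0):
        print(z, relative_weight_flipped(z))
```

For small `z` the particle almost always stays put, while for large `z` the weight tends to one half: the memory of the starting point is lost.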


As an aside, you may notice that the calculation above is identical to the argument 
for why there are no phase transitions in one dimensional thermal systems given in the 
lectures on Statistical Field Theory. 


Back to Yang-Mills Instantons 


Now we can try to apply these same ideas to Yang-Mills instantons. Unfortunately,
things do not work out as nicely as we might have hoped. We would like to approximate
the Yang-Mills path integral

$$Z = \int\mathcal{D}A\ e^{-S_{YM} + iS_\theta}$$


by the contribution from the instanton saddle point. There are the usual issues related 
to gauge fixing, but these do not add anything new to our story so we neglect them 
here and focus only on the aspects directly related to instantons. (We’ll be more careful 
about gauge fixing in Section 2.4.2 when we discuss the beta function.) 


Let’s start by again considering the contribution from a single instanton. The story
proceeds as for the quantum mechanics example until we come to discuss the collective
coordinates. For the instanton in quantum mechanics, there was just a single collective
coordinate $\tau_0$. For our Yang-Mills instanton in SU(2), there are eight. Four of these
are associated to translations in Euclidean spacetime; these play the same role as $\tau_0$
and integrating over them gives a factor of the Euclidean spacetime volume VT, with
V the 3d spatial volume. Three of the collective coordinates arise from the global part
of the gauge symmetry and can be happily integrated over. But this leaves us with the
scale size $\rho$. This too should be singled out from the path integral and integrated over.
We find ourselves with an integral of the form,

$$Z \approx \int_0^\infty d\rho\ K(\rho)\,VT\,e^{-8\pi^2/g^2}\,e^{i\theta}$$
0 


where, as before, $K(\rho)$ includes contributions from the Jacobians and the one-loop
determinant. Now, however, it is a function of the instanton scale size $\rho$ and so we
should do the hard work of calculating it.


We won’t do this hard work, in part because the calculation is rather involved and in 
part because, as we advertised above, the end result doesn’t offer quantitative insights 
into the behaviour of Yang-Mills. It turns out that $K(\rho)$ causes the integral to diverge
at large $\rho$. This raises two concerns. First, it is difficult to justify the dilute instanton
gas approximation if it is dominated by instantons of arbitrarily large size which are 
surely overlapping. Second, and more pressing, it is difficult to justify the saddle point 
expansion at all. This is because, as we describe in some detail in the next section, 
the gauge coupling in Yang-Mills runs; it is small at high energy but becomes large at 
low energies. This means that any semi-classical approximation, such as instantons, is 
valid for describing short distance processes but breaks down at large distances. The 
fact that our attempt to compute the partition function is dominated by instantons of 
large size is really telling us that the whole semi-classical strategy has broken down. 
Instead, we’re going to have to face up to the fact that Yang-Mills is a strongly coupled 
quantum field theory. 


It’s a little disappointing that we can’t push the instanton programme further in 
Yang-Mills. However, it’s not all doom and gloom and we won’t quite leave instan- 
tons behind in these lectures. There are situations where instantons are the leading 
contribution to certain processes. We will see one such example in Section 3.3.2 in 
the context of the anomaly, although for more impressive examples one has to look to 
supersymmetric field theories which are under greater control and beyond the scope of 
these lectures. 


2.4 The Flow to Strong Coupling 


Our discussion in the previous sections has focussed on the classical (or, at the very
least, semi-classical) approach to Yang-Mills. Such a description gives good intuition
for the physics when a theory is weakly coupled, but often fails miserably at strong 
coupling. The next question we should ask is whether Yang-Mills theory is weakly or 
strongly coupled. 


We have chosen a scaling in which the coupling $g^2$ sits in front of the action

$$ S_{YM} = \frac{1}{2g^2} \int d^4x\ {\rm tr}\, F^{\mu\nu} F_{\mu\nu} \qquad (2.55) $$


The quantum theory is defined, in the framework of path integrals, by summing over 
all field configurations weighted with $e^{iS_{YM}}$ in Minkowski space or $e^{-S_{YM}}$ in Euclidean
space. When $g^2$ is small, the Euclidean action has a deep minimum on the solutions
to the classical equations of motion, and these dominate the path integral. In this 
case, the classical field configurations provide a good starting point for a saddle point 
analysis. (In Minkowski space, the action is a stationary point rather than a minimum 
on classical solutions but, once again, these dominate the path integral.) In contrast, 
when $g^2$ is large, many field configurations contribute to the path integral. In this case,
we sometimes talk about quantum fluctuations being large. Now the quantum state 
will look nothing like the solutions to the classical equations of motion. 
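
The contrast between small and large $g^2$ can be illustrated with a zero-dimensional caricature, in which the path integral is replaced by an ordinary integral $\int dx\, e^{-S(x)/g^2}$. The sketch below is only an illustration and not a gauge theory computation: the toy action $S(x) = x^2/2 + x^4/4$ and the function names are assumptions made here. It compares a direct numerical evaluation with the Gaussian (saddle point) estimate $\sqrt{2\pi g^2}$.

```python
import math

def toy_action(x):
    # Hypothetical toy "action" with a single minimum at x = 0.
    # This is an assumption for illustration, not the Yang-Mills action.
    return 0.5 * x**2 + 0.25 * x**4

def exact_integral(g2, half_width=10.0, steps=20_000):
    # Midpoint-rule estimate of Z = \int dx exp(-S(x)/g^2).
    dx = 2.0 * half_width / steps
    total = 0.0
    for n in range(steps):
        x = -half_width + (n + 0.5) * dx
        total += math.exp(-toy_action(x) / g2)
    return total * dx

def saddle_point(g2):
    # Keep only the quadratic term around the minimum: Z ~ sqrt(2 pi g^2).
    return math.sqrt(2.0 * math.pi * g2)

def relative_error(g2):
    exact = exact_integral(g2)
    return abs(exact - saddle_point(g2)) / exact

if __name__ == "__main__":
    for g2 in (0.01, 0.1, 1.0, 5.0):
        print(g2, relative_error(g2))
```

For small $g^2$ the Gaussian estimate is accurate up to corrections of order $g^2$, while for large $g^2$ the quartic term matters and the estimate fails badly.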


All of this would seem to suggest that life is easy when $g^2$ is small, and harder when
$g^2$ is large. However, things are not quite so simple. This is because the effective value
of $g^2$ differs depending on the length scale on which you look: we write $g^2 = g^2(\mu)$,
where $\mu$ is an appropriate energy scale, or inverse length scale. Note that this is quite
a radical departure from the classical picture where any constants you put in the
action remain constant. In quantum field theory, these constants are more wilful: they 
take the values they want to, rather than the values we give them. 


We computed the running of the gauge coupling $g^2$ at one-loop in our previous course
on Advanced Quantum Field Theory. (We will review this computation in Section 2.4.2
below.) The upshot is that the coupling constant depends on the scale $\mu$ as

$$ \frac{1}{g^2(\mu)} = \frac{1}{g_0^2} - \frac{11}{3}\frac{C({\rm adj})}{(4\pi)^2} \log\frac{\Lambda_{UV}^2}{\mu^2} \qquad (2.56) $$

where $g_0^2$ is the coupling constant evaluated at the cut-off scale $\Lambda_{UV}$.


Here $C({\rm adj})$ is a group theoretic factor. Recall that we have fixed a normalisation of
the Lie algebra generators in the fundamental representation to be (2.2),

$$ {\rm tr}\,[T^a T^b] = \frac{1}{2}\delta^{ab} \qquad (2.57) $$

Having pinned down the normalisation in one representation, the other representations
$R$ will have different normalisations,

$$ {\rm tr}\,[T^a(R)\, T^b(R)] = I(R)\, \delta^{ab} $$

The coefficient $I(R)$ is called the Dynkin index of the representation $R$. The convention
(2.57) means that $I(F) = \frac{1}{2}$. The group theoretic factor appearing in the beta function
is simply the Dynkin index in the adjoint representation,


$$ C({\rm adj}) = I({\rm adj}) $$


It is also known as the quadratic Casimir, which is why it is denoted by a different 
letter. For the various simple, compact Lie groups it is given by 


G      | SU(N) | SO(N)   | Sp(N) | E_6 | E_7 | E_8 | F_4 | G_2
C(adj) | N     | N/2 - 1 | N + 1 | 2   | 3/2 | 1/2 | 3/2 | 2


Note that the adjoint representation of $E_8$ is the minimal representation; hence the
appearance of $C({\rm adj}) = I(F) = \frac{1}{2}$.
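
For $SU(N)$ with small $N$, the statement $I({\rm adj}) = N$ can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the generalised Gell-Mann matrices as one convenient basis obeying (2.57), builds the adjoint generators from the structure constants defined by $[T^a, T^b] = i f^{abc} T^c$, and reads off the Dynkin index from a single trace. The function names are hypothetical.

```python
import numpy as np

def su_n_generators(n):
    # Hermitian generators of su(n) in the fundamental representation,
    # normalised so that tr(T^a T^b) = delta^{ab} / 2.
    gens = []
    for j in range(n):
        for k in range(j + 1, n):
            sym = np.zeros((n, n), dtype=complex)
            sym[j, k] = sym[k, j] = 0.5
            asym = np.zeros((n, n), dtype=complex)
            asym[j, k] = -0.5j
            asym[k, j] = 0.5j
            gens += [sym, asym]
    for l in range(1, n):
        diag = np.zeros((n, n), dtype=complex)
        diag[:l, :l] = np.eye(l)
        diag[l, l] = -l
        gens.append(diag / np.sqrt(2.0 * l * (l + 1)))
    return np.array(gens)

def adjoint_generators(T):
    # f^{abc} = -2i tr([T^a, T^b] T^c), then (T_adj^a)_{bc} = -i f^{abc}.
    dim = len(T)
    f = np.zeros((dim, dim, dim))
    for a in range(dim):
        for b in range(dim):
            comm = T[a] @ T[b] - T[b] @ T[a]
            for c in range(dim):
                f[a, b, c] = np.real(-2j * np.trace(comm @ T[c]))
    return -1j * f

def dynkin_index(gens):
    # tr(T^a T^b) = I(R) delta^{ab}; read I(R) off from a = b = 0.
    return float(np.real(np.trace(gens[0] @ gens[0])))

if __name__ == "__main__":
    for n in (2, 3, 4):
        T = su_n_generators(n)
        print(n, dynkin_index(T), dynkin_index(adjoint_generators(T)))
```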
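
The running of the gauge coupling (2.56) is often expressed in terms of the beta
function

$$ \beta(g) \equiv \mu\frac{dg}{d\mu} = \beta_0\, g^3 \quad {\rm with} \quad \beta_0 = -\frac{11}{3}\frac{C({\rm adj})}{(4\pi)^2} \qquad (2.58) $$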


The minus sign in (2.56) or, equivalently, in (2.58), is all important. It tells us that 
the gauge coupling gets stronger as we flow to longer length scales. In contrast, it is 
weaker at short distance scales. This phenomenon is called asymptotic freedom.
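
As a quick consistency check between (2.56) and (2.58), one can evaluate the running coupling numerically and differentiate it. The sketch below is an illustration only: the function names and the sample values of $g_0^2$, $\Lambda_{UV}$ and $\mu$ are assumptions, and the one-loop formula is trusted only while $g^2(\mu)$ stays positive and reasonably small.

```python
import math

def beta0(c_adj):
    # One-loop coefficient from (2.58): beta_0 = -(11/3) C(adj) / (4 pi)^2.
    return -(11.0 / 3.0) * c_adj / (4.0 * math.pi) ** 2

def g_squared(mu, g0_squared, cutoff, c_adj):
    # Running coupling from (2.56), written as 1/g^2 = 1/g0^2 + beta_0 log(cutoff^2/mu^2).
    inverse = 1.0 / g0_squared + beta0(c_adj) * math.log(cutoff**2 / mu**2)
    return 1.0 / inverse

def beta_numeric(mu, g0_squared, cutoff, c_adj, eps=1e-6):
    # Central finite-difference estimate of mu dg/dmu.
    g_plus = math.sqrt(g_squared(mu * (1.0 + eps), g0_squared, cutoff, c_adj))
    g_minus = math.sqrt(g_squared(mu * (1.0 - eps), g0_squared, cutoff, c_adj))
    return (g_plus - g_minus) / (2.0 * eps)

if __name__ == "__main__":
    c_adj, g0_sq, cutoff = 3.0, 1.0, 1000.0   # SU(3) with illustrative inputs
    for mu in (1000.0, 100.0, 10.0):
        g = math.sqrt(g_squared(mu, g0_sq, cutoff, c_adj))
        print(mu, g, beta_numeric(mu, g0_sq, cutoff, c_adj), beta0(c_adj) * g**3)
```

The coupling grows as $\mu$ decreases, and the finite-difference derivative agrees with $\beta_0 g^3$.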


Asymptotic freedom means that Yang-Mills theory is simple to understand at high 
energies, or short distance scales. Here it is a theory of massless, interacting gluon 
fields whose dynamics are well described by the classical equations of motion, together 
with quantum corrections which can be computed using perturbation methods. In 
particular, our discussion of instantons in Section 2.3 is valid at short distance scales. 
However, it becomes much harder to understand what is going on at large distances 
where the coupling gets strong. Indeed, the beta function (2.58) is valid only when 
$g^2(\mu) \ll 1$. This equation therefore predicts its own demise at large distance scales.

We can estimate the distance scale at which we think we will run into trouble. Taking
the one-loop beta function at face value, we can ask: at what scale does $g^2(\mu)$ diverge?
This happens at a finite energy


$$ \Lambda_{QCD} = \Lambda_{UV}\, e^{1/2\beta_0 g_0^2} \qquad (2.59) $$


For historical reasons, we refer to this as the “QCD scale”, reflecting its importance in
the strong force. Alternatively, we can write $\Lambda_{QCD}$ in terms of any scale $\mu$,

$$ \Lambda_{QCD} = \mu\, e^{1/2\beta_0 g^2(\mu)} $$

and $d\Lambda_{QCD}/d\mu = 0$. For this reason, it is sometimes referred to as the RG-invariant
scale.
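
The RG-invariance of $\Lambda_{QCD}$ is easy to confirm numerically. The sketch below, with hypothetical function names and illustrative input values, evaluates (2.59) directly and then again starting from the running coupling at two other scales $\mu$; it assumes the one-loop running (2.56) throughout.

```python
import math

def beta0(c_adj):
    # One-loop coefficient from (2.58).
    return -(11.0 / 3.0) * c_adj / (4.0 * math.pi) ** 2

def inverse_g_squared(mu, g0_squared, cutoff, c_adj):
    # 1/g^2(mu) from (2.56).
    return 1.0 / g0_squared + beta0(c_adj) * math.log(cutoff**2 / mu**2)

def lambda_qcd_from_cutoff(g0_squared, cutoff, c_adj):
    # Equation (2.59).
    return cutoff * math.exp(1.0 / (2.0 * beta0(c_adj) * g0_squared))

def lambda_qcd_from_scale(mu, g0_squared, cutoff, c_adj):
    # Same formula with (cutoff, g0^2) replaced by (mu, g^2(mu)).
    return mu * math.exp(inverse_g_squared(mu, g0_squared, cutoff, c_adj) / (2.0 * beta0(c_adj)))

if __name__ == "__main__":
    c_adj, g0_sq, cutoff = 3.0, 0.5, 1000.0   # illustrative inputs
    print(lambda_qcd_from_cutoff(g0_sq, cutoff, c_adj))
    for mu in (10.0, 0.01):
        print(mu, lambda_qcd_from_scale(mu, g0_sq, cutoff, c_adj))
```

With these inputs the generated scale is many orders of magnitude below the cut-off, even though $g_0^2$ is only moderately small.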


Asymptotic freedom means that $\beta_0 < 0$. This ensures that if $g_0^2 \ll 1$, so that
the theory is weakly coupled at the cut-off, then $\Lambda_{QCD} \ll \Lambda_{UV}$. This is interesting.
Yang-Mills theory naturally generates a scale $\Lambda_{QCD}$ which is exponentially lower than
the cut-off $\Lambda_{UV}$ of the theory. Theoretical physicists spend a lot of time worrying
about “naturalness” which, at heart, is the question of how Nature generates different
length scales. The logarithmic running of the coupling exhibited by Yang-Mills theory
provides a beautiful mechanism to do this. As we will see moving forwards, all the
interesting physics in Yang-Mills occurs at energies of order $\Lambda_{QCD}$.


Viewed naively, there’s something very surprising about the emergence of the scale
$\Lambda_{QCD}$. This is because classical Yang-Mills has no dimensionful parameter. Yet the
quantum theory has a physical scale, $\Lambda_{QCD}$. It seems that the quantum theory has
generated a scale out of thin air, a phenomenon which goes by the name of dimensional
transmutation. In fact, as the definition (2.59) makes clear, there is no mystery about
this. Quantum field theories are not defined by their classical action alone, but
also by the cut-off $\Lambda_{UV}$. Although we might like to think of this cut-off as merely a
crutch, and not something physical, this is misleading. It is not something we can do
without. And it is this cut-off which evolves to the physical scale $\Lambda_{QCD}$.


The question we would like to ask is: what does Yang-Mills theory look like at low
energies, comparable to $\Lambda_{QCD}$? This is a difficult question to answer, and our current
understanding comes primarily from experiment and numerical work, with intuition
built from different analytic approaches. The answer is rather startling: Yang-Mills
theory does not describe massless particles. Instead, the gluons bind together to form
massive particles known as glueballs. These particles have a mass that is of the order
of $\Lambda_{QCD}$, but figuring out the exact spectrum remains challenging. We sometimes say
that the theory is gapped, meaning that there is a gap in the energy spectrum between
the ground state, which we can take to have $E = 0$, and the first excited state with
energy $E = Mc^2$, where $M$ is the mass of the lightest glueball.


Proving the mass gap for Yang-Mills is one of the most important and difficult open 
problems in mathematical physics. In these lectures we will restrict ourselves to building 
some intuition for Yang-Mills theory, and understanding some of the consequences of 
the mass gap. In later sections, we will also see how the situation changes when we couple
Yang-Mills to dynamical matter fields. 


Before we proceed, I should mention a rather subtle and poorly understood caveat. 
We have argued in Sections 2.2 and 2.3 that the dynamics of Yang-Mills theory also 
depends on the theta parameter and we can ask: how does $\theta$ affect the spectrum? We
have only a cursory understanding of this. It is thought that, for nearly all gauge groups,
Yang-Mills remains gapped for all values of $\theta$. However, something interesting happens
at $\theta = \pi$. Recall from Section 1.2.5 that $\theta = \pi$ is special because it preserves
time-reversal invariance, more commonly known in particle physics as CP. For most gauge
groups, it is thought that the dynamics spontaneously breaks time reversal invariance at
$\theta = \pi$, so that Yang-Mills has two degenerate ground states. We will give an argument
for this in Section 3.6 using discrete anomalies, and another in Section 6.2.5 when we
discuss the large N expansion. However, there is speculation that the behaviour of
Yang-Mills is rather different for gauge group $G = SU(2)$ and that, while gapped for
all $\theta \neq \pi$, this theory actually becomes gapless at $\theta = \pi$, where it is conjectured to
be described by a free U(1) photon. We will have nothing to say about this in these 
lectures. 


2.4.1 Anti-Screening and Paramagnetism 


The computations of the 1-loop beta functions are rather involved. It’s useful to have 
a more down-to-earth picture in mind to build some understanding for what’s going 
on. There is a nice intuitive analogy that comes from condensed matter.


In condensed matter physics, materials are not boring passive objects. They contain 
mobile electrons, and atoms with a flexible structure, both of which can respond to 
any external perturbation, such as applied electric or magnetic fields. One consequence 
of this is an effect known as screening. In an insulator, screening occurs because an 
applied electric field will polarise the atoms which, in turn, generate a counteracting 
electric field. One usually describes this by introducing the electric displacement D, 
related to the electric field through 


$$ D = \epsilon E $$

where the permittivity $\epsilon = \epsilon_0(1+\chi_e)$ with $\chi_e$ the electrical susceptibility. For all
materials, $\chi_e > 0$. This ensures that the effect of the polarisation is always to reduce
the electric field, never to enhance it. You can read more about this in Section 7 of the 
lecture notes on Electromagnetism. 


(As an aside: In a metal, with mobile electrons, there is a much stronger screening 
effect which turns the Coulomb force into an exponentially suppressed Debye-Hückel, or
Yukawa, force. This was described in the final section of the notes on Electromagnetism, 
but is not the relevant effect here.) 


What does this have to do with quantum field theory? In quantum field theory, the 
vacuum is not a passive boring object. It contains quantum fields which can respond 
to any external perturbation. In this way, quantum field theories are very much like 
condensed matter systems. A good example comes from QED. There the one-loop 
beta function is positive and, at distances smaller than the Compton wavelength of the 
electron, the gauge coupling runs as 


$$ \frac{1}{e^2(\mu)} = \frac{1}{e_0^2} + \frac{1}{12\pi^2}\log\frac{\Lambda_{UV}^2}{\mu^2} $$


This tells us that the charge of the electron gets effectively smaller as we look at larger
distance scales. This can be understood in very much the same spirit as condensed
matter systems. In the presence of an external charge, electron-positron pairs will
polarize the vacuum, as shown in Figure 11, with the positive charges clustering closer
to the external charge. This cloud of electron-positron pairs shields the original charge,
so that it appears reduced to someone sitting far away.
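
To see the screening numerically, one can evaluate the one-loop QED running quoted above at a few scales. This is a minimal sketch under the assumption of a single charged fermion and with illustrative input values; the function name `e_squared` is hypothetical.

```python
import math

def e_squared(mu, e0_squared, cutoff):
    # One-loop QED running: 1/e^2(mu) = 1/e0^2 + log(cutoff^2/mu^2) / (12 pi^2).
    inverse = 1.0 / e0_squared + math.log(cutoff**2 / mu**2) / (12.0 * math.pi**2)
    return 1.0 / inverse

if __name__ == "__main__":
    e0_sq, cutoff = 0.1, 1.0e6   # illustrative inputs
    for mu in (1.0e6, 1.0e3, 1.0):
        print(mu, e_squared(mu, e0_sq, cutoff))
```

The effective charge decreases monotonically as $\mu$ is lowered, which is the opposite of the Yang-Mills behaviour in (2.56).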


The screening story above makes sense for QED. But what about QCD? The negative 
beta function tells us that the effective charge is now getting larger at long distances, 
rather than smaller. In other words, the Yang-Mills vacuum does not screen charge: it 
anti-screens. From a condensed matter perspective, this is unusual. As we mentioned 
above, materials always have $\chi_e > 0$ ensuring that the electric field is screened, rather
than anti-screened. 


However, there’s another way to view the underlying physics. We can instead think 
about magnetic screening. Recall that in a material, an applied magnetic field induces
dipole moments and these, in turn, give rise to a magnetisation. The resulting
magnetising field $H$ is defined in terms of the applied magnetic field as

$$ B = \mu H $$

with the permeability $\mu = \mu_0(1 + \chi_m)$. Here $\chi_m$ is the magnetic susceptibility and, in
contrast to the electric susceptibility, can take either sign. The sign of $\chi_m$ determines
the magnetisation of the material, which is given by $M = \chi_m H$. For $-1 < \chi_m < 0$,
the magnetisation points in the opposite direction to the applied magnetic field. Such
materials are called diamagnets. (A perfect diamagnet has $\chi_m = -1$. This is what
happens in a superconductor.) In contrast, when $\chi_m > 0$, the magnetisation points in
the same direction as the applied magnetic field. Such materials are called paramagnets.


In quantum field theory, polarisation effects can also make the vacuum either diamagnetic
or paramagnetic. Except now there is a new ingredient which does not show
up in real world materials discussed above: relativity! This means that the product
must be


$$ \epsilon\mu = 1 $$


because “1” is the speed of light. In other words, a relativistic diamagnetic material
will have $\mu < 1$ and $\epsilon > 1$ and so exhibit screening. But a relativistic paramagnetic
material will have $\mu > 1$ and $\epsilon < 1$ and so exhibit anti-screening. Phrased in this way,
the existence of an anti-screening vacuum is much less surprising: it follows simply 
from paramagnetism combined with relativity. 


For free, non-relativistic fermions, we calculated the magnetic susceptibility in the 
lectures on Statistical Physics when we discussed Fermi surfaces. In that context, we 
found two distinct contributions to the magnetisation. Landau diamagnetism arose 
because electrons form Landau levels. Meanwhile, Pauli paramagnetism is due to the 
spin of the electron. These two effects have the same scaling but different numerical 
coefficients and one finds that the paramagnetism wins. 


In the next section we will compute the usual one-loop beta-function. We present the 
computation in such a way that it makes clear the distinction between the diamagnetic 
and paramagnetic contributions. Viewed in this light, asymptotic freedom can be traced 
to the paramagnetic contribution from the gluon spins. 


2.4.2 Computing the Beta Function 


In this section, we will sketch the derivation of the beta function (2.58). We’re going 
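to use an approach known as the background field method. We work in Euclidean space
and decompose the gauge field as

$$ A_\mu = \bar{A}_\mu + \delta A_\mu $$

We will think of $\bar{A}_\mu$ as the low-energy, slowly moving part of the field. It is known
as the background field. Meanwhile, $\delta A_\mu$ describes the high-energy, short-wavelength
modes whose effect we would like to understand. The field strength becomes


$$ F_{\mu\nu} = \bar{F}_{\mu\nu} + \bar{D}_\mu \delta A_\nu - \bar{D}_\nu \delta A_\mu - i[\delta A_\mu, \delta A_\nu] $$


where $\bar{D}_\mu = \partial_\mu - i[\bar{A}_\mu, \cdot\,]$ is the covariant derivative with respect to the background
field $\bar{A}_\mu$. From this, we can write the action (2.55) as


$$
\begin{aligned}
S_{YM} = \frac{1}{g^2}\int d^4x\ {\rm tr}\,\Big[ &\tfrac{1}{2}\bar{F}_{\mu\nu}\bar{F}^{\mu\nu} + 2\bar{F}^{\mu\nu}\bar{D}_\mu \delta A_\nu \\
&+ \bar{D}^\mu \delta A^\nu \bar{D}_\mu \delta A_\nu - \bar{D}^\mu \delta A^\nu \bar{D}_\nu \delta A_\mu - i\bar{F}^{\mu\nu}[\delta A_\mu, \delta A_\nu] \\
&- 2i\bar{D}^\mu \delta A^\nu [\delta A_\mu, \delta A_\nu] - \tfrac{1}{2}[\delta A^\mu, \delta A^\nu][\delta A_\mu, \delta A_\nu] \Big] \qquad (2.60)
\end{aligned}
$$


where we’ve ordered the terms in the action depending on the number of $\delta A$’s. Note
that the middle line is quadratic in $\delta A$.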


Gauge Fixing and Ghosts 


Our plan is to integrate over the fluctuations $\delta A_\mu$ in the path integral, leaving ourselves
with an effective action for the background field $\bar{A}_\mu$. To do this, we must first deal with
the gauge symmetry. While the action of the gauge symmetry on $A_\mu$ is clear, there is
no unique decomposition into the action on $\bar{A}_\mu$ and $\delta A_\mu$. However, the calculation is
simplest if we load the full gauge transformation into $\delta A_\mu$, so


$$ \delta_{\rm gauge}\bar{A}_\mu = 0 \quad {\rm and} \quad \delta_{\rm gauge}(\delta A_\mu) = \bar{D}_\mu \omega - i[\delta A_\mu, \omega] $$


where, for this section alone, we’ve changed our notation for infinitesimal gauge
transformations so as not to confuse them with the fluctuating field $\delta A_\mu$. With this choice,
$\delta A_\mu$ transforms as any other adjoint field.


As usual, field configurations related by a gauge symmetry should be viewed as 
physically equivalent. This is necessary in the present context because the kinetic 
terms for $\delta A_\mu$ are not invertible. For this reason, we first need a way to fix the gauge.
We do this using the Faddeev-Popov procedure that we saw in the lectures on Advanced
Quantum Field Theory. We choose to work in the gauge


$$ G(\bar{A}; \delta A) = \bar{D}^\mu \delta A_\mu = 0 \qquad (2.61) $$


Note that this gauge fixing condition depends on our choice of background field. This
is the advantage of this method; we will find that the gauge invariance of $\bar{A}_\mu$ is retained
throughout the calculation.


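We add to our action the gauge-fixing term

$$ S_{gf} = \frac{1}{g^2}\int d^4x\ {\rm tr}\,(\bar{D}^\mu \delta A_\mu)^2 \qquad (2.62) $$


The choice of overall coefficient of the gauge fixing term is arbitrary. But nice things
happen if we make the choice above. To see why, let’s focus on the $\bar{D}^\mu \delta A^\nu \bar{D}_\nu \delta A_\mu$ term
in (2.60). Integrating by parts, we have


$$
\begin{aligned}
\int d^4x\ {\rm tr}\,\bar{D}^\mu \delta A^\nu \bar{D}_\nu \delta A_\mu &= -\int d^4x\ {\rm tr}\,\delta A_\nu \bar{D}^\mu \bar{D}^\nu \delta A_\mu \\
&= -\int d^4x\ {\rm tr}\,\delta A_\nu \left([\bar{D}^\mu, \bar{D}^\nu] + \bar{D}^\nu \bar{D}^\mu\right)\delta A_\mu \\
&= \int d^4x\ {\rm tr}\,\Big[(\bar{D}^\mu \delta A_\mu)^2 + i\,\delta A_\nu[\bar{F}^{\mu\nu}, \delta A_\mu]\Big]
\end{aligned}
$$


The first of these terms is then cancelled by the gauge fixing term (2.62), leaving us
with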


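$$
\begin{aligned}
S_{YM} + S_{gf} = \frac{1}{g^2}\int d^4x\ {\rm tr}\,\Big[ &\tfrac{1}{2}\bar{F}_{\mu\nu}\bar{F}^{\mu\nu} + 2\bar{F}^{\mu\nu}\bar{D}_\mu \delta A_\nu \\
&+ \bar{D}^\mu \delta A^\nu \bar{D}_\mu \delta A_\nu - 2i\bar{F}^{\mu\nu}[\delta A_\mu, \delta A_\nu] \\
&- 2i\bar{D}^\mu \delta A^\nu [\delta A_\mu, \delta A_\nu] - \tfrac{1}{2}[\delta A^\mu, \delta A^\nu][\delta A_\mu, \delta A_\nu] \Big]
\end{aligned}
$$

and we’re left with just two terms that are quadratic in $\delta A$. We’ll return to these
shortly.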
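
The step that turns the commutator term $i\,\delta A_\nu[\bar{F}^{\mu\nu}, \delta A_\mu]$ into a second copy of $\bar{F}^{\mu\nu}[\delta A_\mu, \delta A_\nu]$ relies only on cyclicity of the trace. A small numerical sanity check, using random Hermitian matrices as stand-ins for the fields at a single spacetime point (an assumption made purely for illustration, with hypothetical function names), can be sketched as:

```python
import numpy as np

def random_hermitian(rng, n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2.0

def commutator(x, y):
    return x @ y - y @ x

def trace_identity_sides(seed=0, n=3, dims=4):
    # Compare sum tr(dA_nu [F^{mu nu}, dA_mu]) with sum tr(F^{mu nu} [dA_mu, dA_nu]).
    rng = np.random.default_rng(seed)
    dA = [random_hermitian(rng, n) for _ in range(dims)]
    F = [[np.zeros((n, n), dtype=complex) for _ in range(dims)] for _ in range(dims)]
    for mu in range(dims):
        for nu in range(mu + 1, dims):
            F[mu][nu] = random_hermitian(rng, n)
            F[nu][mu] = -F[mu][nu]          # antisymmetric in (mu, nu)
    lhs = sum(np.trace(dA[nu] @ commutator(F[mu][nu], dA[mu]))
              for mu in range(dims) for nu in range(dims))
    rhs = sum(np.trace(F[mu][nu] @ commutator(dA[mu], dA[nu]))
              for mu in range(dims) for nu in range(dims))
    return lhs, rhs

if __name__ == "__main__":
    print(trace_identity_sides())
```

Since the two traces agree, the $-i\bar{F}^{\mu\nu}[\delta A_\mu, \delta A_\nu]$ term in (2.60) simply doubles once the gauge fixing term is included.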


The next step of the Faddeev-Popov procedure is to implement the gauge fixing 
condition (2.61) as a delta-function constraint in the path integral. We denote the 
gauge transformed fields as $\bar{A}^\omega_\mu = \bar{A}_\mu$ and $\delta A^\omega_\mu = \delta A_\mu + \bar{D}_\mu \omega - i[\delta A_\mu, \omega]$. We then use
the identity


$$ \int {\cal D}\omega\ \delta\big(G(\bar{A}, \delta A^\omega)\big)\, \det\left(\frac{\partial G(\bar{A}, \delta A^\omega)}{\partial \omega}\right) = 1 $$

The determinant can be rewritten through the introduction of adjoint-valued ghost
fields $c$. For the gauge fixing condition (2.61), we have


$$ \det\left(\frac{\partial G(\bar{A}, \delta A^\omega)}{\partial \omega}\right) = \int {\cal D}c\, {\cal D}c^\dagger\ \exp\left(-\frac{1}{g^2}\int d^4x\ {\rm tr}\,\Big[-c^\dagger \bar{D}^2 c + i c^\dagger[\bar{D}^\mu \delta A_\mu, c]\Big]\right) $$


where we’ve chosen to include an overall factor of $1/g^2$ in the ghost action purely as a
convenience; it doesn’t affect subsequent calculations. The usual Faddeev-Popov story
tells us that the integration $\int {\cal D}\omega$ now decouples, resulting in an unimportant overall
constant. We’re left with an action that includes both the fluctuating gauge field $\delta A_\mu$
and the ghost field $c$, $S = S_{YM} + S_{gf} + S_{ghost}$,


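$$
\begin{aligned}
S = \frac{1}{g^2}\int d^4x\ {\rm tr}\,\Big[ &\tfrac{1}{2}\bar{F}_{\mu\nu}\bar{F}^{\mu\nu} + 2\bar{F}^{\mu\nu}\bar{D}_\mu \delta A_\nu \\
&+ \bar{D}^\mu \delta A^\nu \bar{D}_\mu \delta A_\nu - 2i\bar{F}^{\mu\nu}[\delta A_\mu, \delta A_\nu] + \bar{D}_\mu c^\dagger \bar{D}^\mu c \\
&- 2i\bar{D}^\mu \delta A^\nu [\delta A_\mu, \delta A_\nu] - \tfrac{1}{2}[\delta A^\mu, \delta A^\nu][\delta A_\mu, \delta A_\nu] + i c^\dagger[\bar{D}^\mu \delta A_\mu, c] \Big]
\end{aligned}
$$


As previously, we have arranged the terms so that the middle line is quadratic in
fluctuating fields, while the final line is cubic and higher.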


One-Loop Determinants 


Our strategy now is to integrate out the fluctuating fields, $\delta A_\mu$ and $c$, to determine
their effect on the dynamics of the background field $\bar{A}_\mu$,


$$ e^{-S_{\rm eff}[\bar{A}]} = \int {\cal D}\delta A\, {\cal D}c\, {\cal D}c^\dagger\ e^{-S[\bar{A}, \delta A, c]} $$


Things are simplest if we take our background field to obey the classical equations of
motion, $\bar{D}_\mu \bar{F}^{\mu\nu} = 0$, which ensures that the term linear in $\delta A_\mu$ in the action disappears.
Furthermore, at one loop it will suffice to ignore the terms cubic and quartic in
fluctuating fields that sit on the final line of the action above. We’re then left just with
Gaussian integrations, and these are easy to do,


$$ e^{-S_{\rm eff}[\bar{A}]} = {\det}^{-1/2}\Delta_{\rm gauge}\ {\det}^{+1}\Delta_{\rm ghost}\ e^{-\frac{1}{2g^2}\int d^4x\ {\rm tr}\,\bar{F}_{\mu\nu}\bar{F}^{\mu\nu}} $$

where the quadratic fluctuation operators can be read off from the action and are given
by

$$ \Delta^{\mu\nu}_{\rm gauge} = -\bar{D}^2\delta^{\mu\nu} + 2i[\bar{F}^{\mu\nu}, \cdot\,] \quad {\rm and} \quad \Delta_{\rm ghost} = -\bar{D}^2 $$

where the $\bar{F}^{\mu\nu}$ should be thought of as an operator acting on objects in the adjoint
representation. This extra term, $\bar{F}_{\mu\nu}$, arising from the gauge fields can be traced to
the fact that they are spin 1 excitations. As we will see below, this contributes the
paramagnetic part to the beta function and, ultimately, is responsible for the famous
minus sign that leads to anti-screening.


Taking logs of both sides, the effective action is given by 


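$$ S_{\rm eff}[\bar{A}] = \frac{1}{2g^2}\int d^4x\ {\rm tr}\,\bar{F}_{\mu\nu}\bar{F}^{\mu\nu} + \frac{1}{2}{\rm Tr}\log\Delta_{\rm gauge} - {\rm Tr}\log\Delta_{\rm ghost} \qquad (2.63) $$
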
where the Tr means the trace over group, Lorentz and momentum indices (as opposed 
to tr which is over only gauge group indices). We need to figure out how to compute 
the contributions from these quadratic fluctuation operators. 


The Ghost Contribution 


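The contribution from the ghost fields is simplest because it has the least structure.
We write


$$ \Delta_{\rm ghost} = -\partial^2 + \Delta_1 + \Delta_2 $$

where the subscripts keep track of how many $\bar{A}_\mu$ terms each operator has,

$$ \Delta_1 = i\partial^\mu \bar{A}_\mu + i\bar{A}_\mu \partial^\mu \quad {\rm and} \quad \Delta_2 = [\bar{A}^\mu, [\bar{A}_\mu, \cdot\,]] $$


where, again, these operators act on objects in the adjoint representation. This will
prove important to get the right normalisation factor. We then have


$$
\begin{aligned}
{\rm Tr}\log\Delta_{\rm ghost} &= {\rm Tr}\log\left(-\partial^2 + \Delta_1 + \Delta_2\right) \\
&= {\rm Tr}\log(-\partial^2) + {\rm Tr}\log\left(1 + (-\partial^2)^{-1}(\Delta_1 + \Delta_2)\right) \\
&= {\rm Tr}\log(-\partial^2) + {\rm Tr}\left((-\partial^2)^{-1}(\Delta_1 + \Delta_2)\right) - \frac{1}{2}{\rm Tr}\left((-\partial^2)^{-1}(\Delta_1 + \Delta_2)\right)^2 + \ldots
\end{aligned}
$$


The first term is just an overall constant. We can ignore it. In the second term, ${\rm Tr}\,\Delta_1$
includes the trace over gauge indices and vanishes because ${\rm tr}\,\bar{A}_\mu = 0$. This is just the
statement that there is no gauge invariant contribution to the kinetic term linear in
$\bar{A}_\mu$. So the first terms that we need to worry about are the quadratic terms.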
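
The expansion of the logarithm used above is the operator version of $\log\det(1 + x) = {\rm Tr}\, x - \frac{1}{2}{\rm Tr}\, x^2 + \ldots$. As a finite-dimensional sanity check, with a small random symmetric matrix standing in for $(-\partial^2)^{-1}(\Delta_1 + \Delta_2)$ (an assumption for illustration only, with hypothetical function names), one can compare the exact value with the truncated series:

```python
import numpy as np

def tr_log_exact(x):
    # Tr log(1 + x) = log det(1 + x), evaluated stably with slogdet.
    sign, logdet = np.linalg.slogdet(np.eye(x.shape[0]) + x)
    return logdet

def tr_log_first_order(x):
    return np.trace(x)

def tr_log_second_order(x):
    return np.trace(x) - 0.5 * np.trace(x @ x)

def small_symmetric_matrix(seed=0, size=5, scale=1e-3):
    rng = np.random.default_rng(seed)
    m = rng.normal(size=(size, size))
    return scale * (m + m.T) / 2.0

if __name__ == "__main__":
    x = small_symmetric_matrix()
    print(tr_log_exact(x), tr_log_first_order(x), tr_log_second_order(x))
```

Keeping the quadratic term improves the agreement markedly, which is why the quadratic terms are the first that matter once the linear term vanishes.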


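$$ {\rm Tr}\left((-\partial^2)^{-1}\Delta_2\right) = \int\frac{d^4k}{(2\pi)^4}\ {\rm tr}_{\rm adj}\big[\bar{A}_\mu(k)\bar{A}^\mu(-k)\big]\int\frac{d^4p}{(2\pi)^4}\frac{1}{p^2} $$


where we’ve also included a graphical reminder of where these terms come from in a
more traditional Feynman diagram approach. We also have


$$ -\frac{1}{2}{\rm Tr}\left((-\partial^2)^{-1}\Delta_1(-\partial^2)^{-1}\Delta_1\right) = -\frac{1}{2}\int\frac{d^4k}{(2\pi)^4}\ {\rm tr}_{\rm adj}\big[\bar{A}_\mu(k)\bar{A}_\nu(-k)\big]\, f^{\mu\nu}(k) $$


with


$$ f^{\mu\nu}(k) = \int\frac{d^4p}{(2\pi)^4}\frac{(2p+k)^\mu (2p+k)^\nu}{p^2(p+k)^2} $$


Note that the trace over group indices should be taken with $\bar{A}_\mu$ acting on adjoint valued
objects, as opposed to our convention in (2.3) where it naturally acts on fundamental
objects.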


We would like to massage these into the form of the Yang-Mills action. In momentum
space, the quadratic part of the Yang-Mills action reads


$$
\begin{aligned}
S_{\rm quad} &= \frac{1}{g^2}\int d^4x\ {\rm tr}\,\big(\partial_\mu \bar{A}_\nu \partial^\mu \bar{A}^\nu - \partial_\mu \bar{A}_\nu \partial^\nu \bar{A}^\mu\big) \\
&= \frac{1}{g^2}\int\frac{d^4k}{(2\pi)^4}\ {\rm tr}\,\big[\bar{A}_\mu(k)\bar{A}_\nu(-k)\big]\,(k^2\delta^{\mu\nu} - k^\mu k^\nu)
\end{aligned}
$$


There are a couple of issues that we need to deal with. First, the Yang-Mills action
is written in terms of fundamental generators which, as in (2.57), are normalised as
${\rm tr}\,T^aT^b = \frac{1}{2}\delta^{ab}$. Meanwhile, the trace in the one-loop contributions is in the adjoint
representation, and is given by


$$ {\rm tr}_{\rm adj}\,T^aT^b = C({\rm adj})\,\delta^{ab} $$


Second, we must perform the integral over the loop momentum $p$. This, of course,
diverges. These are the kind of integrals that were covered in previous QFT courses.
We implement a UV cut-off $\Lambda_{UV}$ to get


$$ -{\rm Tr}\log\Delta_{\rm ghost} = -\frac{C({\rm adj})}{3(4\pi)^2}\int\frac{d^4k}{(2\pi)^4}\ {\rm tr}\,\big[\bar{A}_\mu(k)\bar{A}_\nu(-k)\big]\,(k^2\delta^{\mu\nu} - k^\mu k^\nu)\,\log\frac{\Lambda_{UV}^2}{k^2} $$

This is our first contribution to the logarithmic running of the coupling that we advertised
in (2.56).


Above we focussed purely on the quadratic terms. Expanding the Yang-Mills action
also gives us cubic and quartic terms and, for consistency, we should check that they
too receive the same corrections. Indeed they do. In fact, this is guaranteed to work
because of the manifest gauge invariance $\delta_{\rm gauge}\bar{A}_\mu = \bar{D}_\mu \omega$.


The Gauge Contribution 


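Next up is the contribution $\frac{1}{2}{\rm Tr}\log\Delta_{\rm gauge}$, where


$$ \Delta^{\mu\nu}_{\rm gauge} = \Delta_{\rm ghost}\,\delta^{\mu\nu} + 2i[\bar{F}^{\mu\nu}, \cdot\,] $$


We see that part of the calculation involves $\Delta_{\rm ghost}$, and so gives the same answer as
above. The only difference is the spin indices $\delta^{\mu\nu}$ which give an extra factor of 4 after
taking the trace. This means that


$$ {\rm Tr}\log\Delta_{\rm gauge} = 4\,{\rm Tr}\log\Delta_{\rm ghost} + \bar{F}_{\mu\nu}\ {\rm terms} $$


On rotational grounds, there is no term linear in $\bar{F}_{\mu\nu}$. This means that the first term
comes from expanding out $\log\Delta_{\rm gauge}$ to quadratic order and focussing on the $\bar{F}_{\mu\nu}$ terms,


$$
\begin{aligned}
\bar{F}_{\mu\nu}\ {\rm terms} &= -\frac{1}{2}(2i)^2\,{\rm Tr}\left((-\partial^2)^{-1}[\bar{F}_{\mu\nu}, (-\partial^2)^{-1}[\bar{F}^{\nu\mu}, \cdot\,]]\right) \\
&= -\frac{1}{2}\int\frac{d^4k}{(2\pi)^4}\ {\rm tr}_{\rm adj}\big[\bar{A}_\mu(k)\bar{A}_\nu(-k)\big]\int\frac{d^4p}{(2\pi)^4}\frac{8\,(k^2\delta^{\mu\nu} - k^\mu k^\nu)}{p^2(p+k)^2}
\end{aligned}
$$


Once again, we have a divergent integral to compute. This time we get


$$ \bar{F}_{\mu\nu}\ {\rm terms} = -8\,\frac{C({\rm adj})}{(4\pi)^2}\int\frac{d^4k}{(2\pi)^4}\ {\rm tr}\,\big[\bar{A}_\mu(k)\bar{A}_\nu(-k)\big]\,(k^2\delta^{\mu\nu} - k^\mu k^\nu)\,\log\frac{\Lambda_{UV}^2}{k^2} $$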
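
The logarithm in these expressions comes from the loop integral $\int \frac{d^4p}{(2\pi)^4}\frac{1}{p^2(p+k)^2}$. A short numerical sketch of its cut-off dependence is given below. It assumes a sharp cut-off $|p| < \Lambda_{UV}$, which is one regularisation choice among several, and the function names are hypothetical. The angular average of $1/(p+k)^2$ over the three-sphere equals $1/\max(p^2, k^2)$, which the first function checks by quadrature and the second function then uses.

```python
import math

def angular_average(p, k, steps=2000):
    # Average of 1/(p+k)^2 over the direction of p in four dimensions.
    # With theta the angle between p and k, the normalised measure on the
    # three-sphere is (2/pi) sin^2(theta) d(theta).
    dtheta = math.pi / steps
    total = 0.0
    for n in range(steps):
        theta = (n + 0.5) * dtheta
        total += math.sin(theta) ** 2 / (p * p + k * k + 2.0 * p * k * math.cos(theta))
    return (2.0 / math.pi) * total * dtheta

def loop_integral(k, cutoff, steps=100_000):
    # \int d^4p/(2 pi)^4 1/(p^2 (p+k)^2) over |p| < cutoff.
    # After the angular average, d^4p/(2 pi)^4 -> p^3 dp / (8 pi^2) and
    # 1/(p+k)^2 -> 1/max(p^2, k^2). We integrate in t = log(p).
    t_min = math.log(k) - 10.0      # the region below this contributes negligibly
    t_max = math.log(cutoff)
    dt = (t_max - t_min) / steps
    total = 0.0
    for n in range(steps):
        p = math.exp(t_min + (n + 0.5) * dt)
        total += p * p / max(p * p, k * k)
    return total * dt / (8.0 * math.pi ** 2)

def leading_log(k, cutoff):
    # The logarithmically divergent piece, log(cutoff^2/k^2) / (4 pi)^2.
    return math.log(cutoff ** 2 / k ** 2) / (4.0 * math.pi) ** 2

if __name__ == "__main__":
    print(angular_average(0.5, 2.0), angular_average(3.0, 1.0))
    for cutoff in (1.0e3, 1.0e5):
        print(cutoff, loop_integral(1.0, cutoff), leading_log(1.0, cutoff))
```

The numerical integral differs from the leading logarithm only by a cut-off independent constant, so the coefficient of $\log(\Lambda_{UV}^2/k^2)$ is $1/(4\pi)^2$.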


The sum then gives the contribution to the effective action,


$$ \frac{1}{2}{\rm Tr}\log\Delta_{\rm gauge} = \frac{1}{2}\left[\frac{4}{3} - 8\right]\frac{C({\rm adj})}{(4\pi)^2}\int\frac{d^4k}{(2\pi)^4}\ {\rm tr}\,\big[\bar{A}_\mu(k)\bar{A}_\nu(-k)\big]\,(k^2\delta^{\mu\nu} - k^\mu k^\nu)\,\log\frac{\Lambda_{UV}^2}{k^2} $$


Here the 4/3 is the diamagnetic contribution. In fact, it’s overkill since it neglects
the gauge redundancy. This is subtracted by including the contribution from the ghost
fields. Together, these give rise to a positive beta function. In contrast, the $-8$ term is
the paramagnetic piece, and can be traced to the spin 1 nature of the gauge field. This
is where the overall minus sign comes from.


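The coefficient of the kinetic terms is precisely the gauge coupling $1/g^2$. Combining
both gauge and ghost contributions, and identifying the momentum $k$ of the background
field as the relevant scale $\mu$, we have


$$
\begin{aligned}
\frac{1}{g^2(\mu)} &= \frac{1}{g^2} + \frac{C({\rm adj})}{(4\pi)^2}\left[\frac{1}{2}\left(\frac{4}{3} - 8\right) - \frac{1}{3}\right]\log\frac{\Lambda_{UV}^2}{\mu^2} \\
&= \frac{1}{g^2} - \frac{11}{3}\frac{C({\rm adj})}{(4\pi)^2}\log\frac{\Lambda_{UV}^2}{\mu^2}
\end{aligned}
$$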
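
The arithmetic behind the famous coefficient is short enough to check with exact fractions. In the sketch below the three inputs are the numbers quoted in the text, measured in units of $C({\rm adj})/(4\pi)^2 \times \log(\Lambda_{UV}^2/\mu^2)$ as corrections to $1/g^2$; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Contributions to Tr log of the gauge operator, in the units described above.
diamagnetic = Fraction(4, 3)    # four copies of the scalar-like operator
paramagnetic = Fraction(-8)     # from the [F, .] term, i.e. the gluon spin

# The ghosts enter the effective action with the opposite sign and no factor of 1/2.
ghost = Fraction(-1, 3)

gauge = Fraction(1, 2) * (diamagnetic + paramagnetic)
total = gauge + ghost

# Without the spin term, the remaining (diamagnetic plus ghost) piece is positive.
screening_only = Fraction(1, 2) * diamagnetic + ghost

if __name__ == "__main__":
    print(total, screening_only)
```

The total is $-11/3$, while dropping the paramagnetic term would leave $+1/3$, which has the sign of ordinary screening.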


This is in agreement with the advertised result (2.58). As explained previously, the
overall minus sign here is important. Indeed, it was worth a Nobel prize.


2.5 Electric Probes


When we first studied Maxwell’s theory of Electromagnetism, one of the most basic 
questions we asked was: what’s the force between two charged particles? In these 
calculations, the charged particles are sources which we’ve inserted by hand; we’re 
using them as a probe of the theory, to see how the electromagnetic fields respond in 
their presence. In this section we will develop the tools that will allow us to ask similar 
questions about non-Abelian gauge theories. 


2.5.1 Coulomb vs Confining 


We start by building up some expectation from the classical physics. Asymptotic 
freedom means that these classical results will be valid when the particles are close by, 
separated by distances $\ll 1/\Lambda_{QCD}$, but are unlikely to hold when they are far separated.
Nonetheless, it will be useful to understand the theory in this regime, if only because 
it highlights just how surprising the long distance, quantum behaviour actually is. 


In electromagnetism, two particles of equal and opposite charges $\pm e$, separated by a
distance $r$, experience an attractive Coulomb force. This can be described in terms of
the potential energy $V(r)$,


$$ V(r) = -\frac{e^2}{4\pi r} $$


In the framework of QED, we can reproduce this from the tree-level exchange of a single
photon, as shown in Figure 12. We did this in our first course on Quantum Field Theory.


Here we do the same calculation in SU(N) Yang-Mills theory. We refer to the charged
particles as quarks. For now, we’ll take these particles to sit in the fundamental
representation of SU(N), although the methods we use here easily generalise to arbitrary
gauge groups and representations. Each quark and anti-quark carries a colour index,
$i = 1, \ldots, N$. Moreover, when they exchange a gluon, this colour index can change. The
tree-level diagram takes the same form, but with a gluon exchanged instead of a photon.
It gives


$$ V(r) = \frac{g^2}{4\pi r}\, T^a_{ki}\, T^{\star\, a}_{lj} $$


But we’ve still got those colour indices to deal with, $i, j$ for ingoing, and $k, l$ for
outgoing. We should think of $T^a T^{\star\, a}$ as an $N^2 \times N^2$ matrix, acting on the $N^2$ different
ingoing colour states. These different $N^2$ states then split into different irreducible
representations. For our quark and anti-quark, we have


$$ N \otimes \bar{N} = 1 \oplus {\rm adj} \qquad (2.65) $$


where the adjoint representation has dimension $N^2 - 1$. The matrix $T^a T^{\star\, a}$ will then
have two different eigenvalues, one for each of these representations. This will lead to
two different coefficients for the forces.


An Aside on Group Theory 


We need a way to compute the eigenvalues of $T^a T^{\star\, a}$ in these two different
representations. In fact, we’ve met this kind of problem before; it’s the same kind of issue
that arose in our lectures on Applications of Quantum Mechanics when we treated the
spin-orbit coupling $L \cdot S$ of an atom. In that case we wrote $J = L + S$ and used the
identity $L \cdot S = \frac{1}{2}(J^2 - L^2 - S^2) = \frac{1}{2}\left(j(j+1) - l(l+1) - s(s+1)\right)$.


We can repeat this trick for any group $G$. Consider two representations $R_1$ and $R_2$
and the associated generators $T^a(R_1)$ and $T^a(R_2)$. We construct a new operator


$$ S^a = T^a(R_1) \otimes 1 + 1 \otimes T^a(R_2) $$

We then have

$$ T^a(R_1) \otimes T^a(R_2) = \frac{1}{2}\left[S^a S^a - T^a(R_1)T^a(R_1) \otimes 1 - 1 \otimes T^a(R_2)T^a(R_2)\right] $$


But it is simple to show that $T^a(R)T^a(R)$ commutes with all elements of the group
and so is proportional to the identity,


$$ T^a(R)T^a(R) = C(R)\,1 \qquad (2.66) $$


where C(R) is known as the quadratic Casimir, a number which characterises the 
representation R. In our discussion of beta functions in Section 2.4, we encountered 
the Dynkin index, which is the coefficient of the trace normalisation 


$$ {\rm tr}\,T^a(R)T^b(R) = I(R)\,\delta^{ab} $$


The two are related by

$$ I(R)\,{\rm dim}(G) = C(R)\,{\rm dim}(R) $$

where dim(G) is the dimension of the group and dim(R) is the dimension of the
representation. Note that this is consistent with our earlier claim that $I({\rm adj}) = C({\rm adj})$. For
$G = SU(N)$, the fundamental and adjoint representations have


$$ C(N) = C(\bar{N}) = \frac{N^2 - 1}{2N} \quad {\rm and} \quad C({\rm adj}) = N $$

while the symmetric (sym) and anti-symmetric (anti-sym) representations have

$$ C({\rm sym}) = \frac{(N-1)(N+2)}{N} \quad {\rm and} \quad C({\rm anti\text{-}sym}) = \frac{(N-2)(N+1)}{N} $$
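
The fundamental Casimir and its relation to the Dynkin index can be verified numerically for small $N$. The sketch below assumes the generalised Gell-Mann basis, normalised as in (2.57); the function names are hypothetical and the construction is one convenient choice of basis rather than anything specific to these lectures.

```python
import numpy as np

def su_n_generators(n):
    # Hermitian generators of su(n) in the fundamental representation,
    # normalised so that tr(T^a T^b) = delta^{ab} / 2.
    gens = []
    for j in range(n):
        for k in range(j + 1, n):
            sym = np.zeros((n, n), dtype=complex)
            sym[j, k] = sym[k, j] = 0.5
            asym = np.zeros((n, n), dtype=complex)
            asym[j, k] = -0.5j
            asym[k, j] = 0.5j
            gens += [sym, asym]
    for l in range(1, n):
        diag = np.zeros((n, n), dtype=complex)
        diag[:l, :l] = np.eye(l)
        diag[l, l] = -l
        gens.append(diag / np.sqrt(2.0 * l * (l + 1)))
    return np.array(gens)

def quadratic_casimir_matrix(gens):
    # sum_a T^a T^a, which should be C(R) times the identity.
    return sum(t @ t for t in gens)

def dynkin_index(gens):
    # tr(T^a T^b) = I(R) delta^{ab}; read I(R) off from a = b = 0.
    return float(np.real(np.trace(gens[0] @ gens[0])))

if __name__ == "__main__":
    for n in (2, 3, 4):
        T = su_n_generators(n)
        c_matrix = quadratic_casimir_matrix(T)
        print(n, np.real(c_matrix[0, 0]), (n * n - 1) / (2 * n), dynkin_index(T))
```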


Non-Abelian Coulomb Force 


Let’s now apply this to the force between quarks. The group theory machinations above
tell us that the operator $T^a(R_1)T^a(R_2)$ decomposes into a block diagonal matrix, with
entries labelled by the irreducible representations $R \subset R_1 \otimes R_2$ and given by

$$ T^a(R_1)T^a(R_2)\Big|_R = \frac{1}{2}\left[C(R) - C(R_1) - C(R_2)\right] $$

The quark and anti-quark can sit in two different irreducible representations: the singlet
and the adjoint (2.65). For the singlet, we have

$$ \frac{1}{2}\left[C(1) - C(N) - C(\bar{N})\right] = -\frac{N^2 - 1}{2N} $$

The minus sign ensures that the force between the quark and anti-quark in the singlet
channel is attractive. This is what we would have expected from our classical intuition.
However, when the quarks sit in the adjoint channel, we have

$$ \frac{1}{2}\left[C({\rm adj}) - C(N) - C(\bar{N})\right] = \frac{1}{2N} $$

Perhaps surprisingly, this is a repulsive force.
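
These two eigenvalues can be confirmed by brute force. The sketch below builds the $N^2 \times N^2$ matrix $\sum_a T^a(N) \otimes T^a(\bar{N})$ and diagonalises it, taking the anti-fundamental generators to be $T^a(\bar{N}) = -(T^a)^\star$. The basis of generators and the function names are assumptions made for this illustration.

```python
import numpy as np

def su_n_generators(n):
    # Hermitian generators of su(n) in the fundamental representation,
    # normalised so that tr(T^a T^b) = delta^{ab} / 2.
    gens = []
    for j in range(n):
        for k in range(j + 1, n):
            sym = np.zeros((n, n), dtype=complex)
            sym[j, k] = sym[k, j] = 0.5
            asym = np.zeros((n, n), dtype=complex)
            asym[j, k] = -0.5j
            asym[k, j] = 0.5j
            gens += [sym, asym]
    for l in range(1, n):
        diag = np.zeros((n, n), dtype=complex)
        diag[:l, :l] = np.eye(l)
        diag[l, l] = -l
        gens.append(diag / np.sqrt(2.0 * l * (l + 1)))
    return np.array(gens)

def quark_antiquark_operator(n):
    # sum_a T^a(N) (x) T^a(Nbar), with T^a(Nbar) = -(T^a)^*.
    return sum(np.kron(t, -t.conj()) for t in su_n_generators(n))

def channel_eigenvalues(n):
    # The operator is Hermitian, so eigvalsh returns real eigenvalues in ascending order.
    return np.linalg.eigvalsh(quark_antiquark_operator(n))

if __name__ == "__main__":
    print(channel_eigenvalues(3))
```

For $N = 3$ one eigenvalue is $-4/3$ (the singlet) and the remaining eight are $1/6$ (the adjoint), matching the formulas above.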

The group theory analysis above makes it simple to compute the classical force
between quarks in any representation. Suppose, for example, we have two quarks, both
in the fundamental representation. They decompose as


$$ N \otimes N = {\rm sym} \oplus {\rm anti\text{-}sym} $$

where ${\rm dim}({\rm sym}) = \frac{1}{2}N(N+1)$ and ${\rm dim}({\rm anti\text{-}sym}) = \frac{1}{2}N(N-1)$. We then have

$$ \frac{1}{2}\left[C({\rm sym}) - C(N) - C(N)\right] = \frac{N-1}{2N} $$

and

$$ \frac{1}{2}\left[C({\rm anti\text{-}sym}) - C(N) - C(N)\right] = -\frac{N+1}{2N} $$


and the force is repulsive between quarks in the symmetric channel, but attractive in
the anti-symmetric channel.

We see that, even classically, Yang-Mills theory provides a somewhat richer structure
to the forces between particles. However, at the classical level, Yang-Mills retains the
familiar 1/r fall-off from Maxwell theory. This is the signature of a force due to the
exchange of massless particles in d = 3+1 dimensions, whether photons or gravitons or,
in this case, gluons. As we now explain, at the quantum level things are very different.
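
The channel coefficients quoted above follow from the single formula $\frac{1}{2}[C(R) - C(R_1) - C(R_2)]$ and can be tabulated with exact arithmetic. The sketch below uses the $SU(N)$ Casimirs listed earlier; the dictionary keys and function names are hypothetical labels chosen for this illustration.

```python
from fractions import Fraction

def casimirs(n):
    # Quadratic Casimirs of SU(N) representations in the normalisation (2.57).
    n = Fraction(n)
    return {
        "singlet": Fraction(0),
        "fund": (n * n - 1) / (2 * n),
        "adj": n,
        "sym": (n - 1) * (n + 2) / n,
        "anti-sym": (n - 2) * (n + 1) / n,
    }

def channel_coefficient(c_r, c_r1, c_r2):
    # Eigenvalue of T^a(R1) T^a(R2) on the irreducible representation R.
    return Fraction(1, 2) * (c_r - c_r1 - c_r2)

def quark_quark_channels(n):
    c = casimirs(n)
    return {name: channel_coefficient(c[name], c["fund"], c["fund"])
            for name in ("sym", "anti-sym")}

def quark_antiquark_channels(n):
    c = casimirs(n)
    return {name: channel_coefficient(c[name], c["fund"], c["fund"])
            for name in ("singlet", "adj")}

if __name__ == "__main__":
    print(quark_quark_channels(3))
    print(quark_antiquark_channels(3))
```

A useful cross-check is that the coefficients, weighted by the dimensions of the channels, sum to zero, as they must because the generators are traceless.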


The Confining Force 


In the previous section, we stated (but didn’t prove!) that Yang-Mills has a mass 
gap. This means that, at distances $\gg 1/\Lambda_{QCD}$, the force will be due to the exchange
of massive particles rather than massless particles. In many situations, the exchange
of massive particles results in an exponentially suppressed Yukawa force, of the form
$V(r) \sim e^{-mr}/r$, and you might have reasonably thought this would be the case for
Yang-Mills. You would have been wrong. 


Let’s again consider a quark and an anti-quark, in the $N$ and $\bar{N}$ representations
respectively. The energy between the two turns out to grow linearly with distance


$$ V(r) = \sigma r \qquad (2.67) $$


for some value $\sigma$ that has dimensions of energy per length. For reasons that we will
explain shortly, it is often referred to as the string tension. On dimensional grounds,
we must have $\sigma \sim \Lambda_{QCD}^2$ since there is no other scale in the game.


For two quarks, the result is even more dramatic. Now the tensor product of the two 
representations does not include a singlet (at least this is true for SU(N) with $N \geq 3$).
The energy between the two quarks turns out to be infinite. This is a general property 
of quantum Yang-Mills: the only finite energy states are gauge singlets. The theory is 
said to be confining: an individual quark cannot survive on its own, but is forced to 
enjoy the company of friends. 


There is a possibility for confusion in the claim that only singlet states survive
in a confining gauge theory. In any gauge theory, one should only talk about gauge 
invariant states and a single quark is not a gauge invariant object. However, we can 
render the quark gauge invariant by attaching a Wilson line (2.14) which stretches 
from the position of the quark to infinity. When we blithely talk about a single quark, 
we should really be thinking of this composite object. This is not directly related to 
the issue of confinement. Indeed, the statements above hold equally well for electrons 
in QED: these too are only gauge invariant when attached to a Wilson line. Instead 
the issue of confinement is a dynamical statement, rather than a kinematical one. 
Confinement means that the quark + Wilson line costs infinite energy in Yang-Mills, 
while the electron + Wilson line (suitably regulated) costs finite energy in QED. 




There are situations where it’s not possible to form a singlet from a pair of particles, 
but it is possible if enough particles are added. The baryon provides a good example, 
in which N quarks, each in the fundamental representation of SU(N), combine to form 
a singlet $B = \epsilon^{i_1 \ldots i_N} q_{i_1} \ldots q_{i_N}$. These too are finite energy states.


Confinement in Yang-Mills is, like the mass gap, a challenging problem. There is no 
analytic demonstration of this phenomenon. Instead, we will focus on building some 
intuition for why this might occur and understanding the right language to describe it. 


2.5.2 An Analogy: Flux Lines in a Superconductor 


There is a simple system which provides a useful analogy for confinement. This is a 
superconductor. 


One of the wonders of the superconducting vacuum is its ability to expel magnetic 
fields. If you attempt to pass a magnetic field through a superconductor, it resists. This
is known as the Meissner effect. If you insist, by cranking up the magnetic field, the 
superconductor will relent, but it will not do so uniformly. Instead, the magnetic field 
will form string-like filaments known as vortices. 


We can model this using the Abelian Higgs model. This is a U(1) gauge field, coupled 
to a complex scalar 


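$$S = \int d^4x\ \left[ -\frac{1}{4e^2}F_{\mu\nu}F^{\mu\nu} + |\mathcal{D}_\mu\phi|^2 - \lambda\left(|\phi|^2 - v^2\right)^2 \right]$$


with $\mathcal{D}_\mu\phi = \partial_\mu\phi - iA_\mu\phi$. (As an aside: in an actual superconductor, the complex scalar
field describes the Cooper pair of electrons, and should have a non-relativistic kinetic
term rather than the relativistic kinetic terms we use here.)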


In the vacuum, the scalar has an expectation value, $\langle|\phi|\rangle = v$, spontaneously breaking
the U(1) gauge symmetry and giving the photon a mass, $m_\gamma^2 = 2e^2v^2$. This is, of course,
the Higgs mechanism. In this vacuum, the scalar also has a mass given by $m_\phi^2 = 4\lambda v^2$.


Let’s start by seeing how this explains the Meissner effect. We’ll look for time
independent solutions, with $A_0 = 0$ and a magnetic field $B^i = -\frac{1}{2}\epsilon^{ijk}F_{jk}$. If we assume
that the Higgs field doesn’t deviate from $\phi = v$ then the equation of motion for the
gauge field is


$$\nabla \times \mathbf{B} = -m_\gamma^2\,\mathbf{A} \quad\Rightarrow\quad \nabla^2\mathbf{B} = m_\gamma^2\,\mathbf{B}$$


This is known as the London equation. It tells us that magnetic fields are exponentially
damped in the Higgs phase, with solutions of the form $B(x) = B_0\,e^{-m_\gamma x}$. In the context
of superconductors, the length scale $L = 1/m_\gamma$ is known as the penetration depth. Later
another length scale, $\xi \sim 1/m_\phi$, will also be important; this is called the correlation
length.
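As a sanity check on the exponential damping, one can verify numerically that a decaying profile solves the one-dimensional London equation B'' = m² B. This is a minimal sketch, assuming a field that varies only along x. The values of `m_gamma` and `b0` are arbitrary illustrative choices, and the finite-difference helper is not taken from the notes.

```python
import math

m_gamma = 2.0  # photon mass in arbitrary units (illustrative value)
b0 = 1.0       # field strength at the surface x = 0 (illustrative value)


def b_field(x):
    """Candidate solution B(x) = B_0 exp(-m_gamma x)."""
    return b0 * math.exp(-m_gamma * x)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def london_residual(x):
    """B''(x) - m_gamma^2 B(x), which should vanish up to discretisation error."""
    return second_derivative(b_field, x) - m_gamma ** 2 * b_field(x)


# The field drops by a factor of e over one penetration depth.
penetration_depth = 1.0 / m_gamma
```

Evaluating `london_residual` at a few points gives values consistent with zero at the accuracy of the finite-difference scheme.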


Of course, the assumption that $\phi = v$ is not justified: $\phi$ is a dynamical field and
is determined by its equation of motion. This is where we will find the vortices. We
decompose the complex scalar as

$$\phi = \rho\,e^{i\sigma}$$

All finite energy, classical configurations must have $\rho \to v$ as $x \to \infty$. But the phase $\sigma$
is arbitrary. This opens up an interesting topological possibility. Consider a classical
configuration which is invariant in the $x^3$ direction, but is localised in the $(x^1, x^2)$ plane.
The translational invariance in $x^3$ reflects the fact that we will be constructing an infinite
string solution, aligned along $x^3$. We parameterise the plane by radial coordinates
$x^1 + ix^2 = re^{i\theta}$. Then all configurations whose energy is finite when integrated over the
$(x^1, x^2)$ plane involve a map


$$\sigma(\theta) : S^1_\infty \mapsto S^1 \tag{2.68}$$


These maps fall into disjoint classes, labelled by the number of times that $\sigma$ winds as
we move around the asymptotic circle $S^1_\infty$. This is the same kind of idea that we met
when discussing theta vacua and instantons in Sections 2.2 and 2.3. In that case we
were dealing with the homotopy group $\Pi_3(S^3)$; here we have a simpler situation, with
maps of the form (2.68) classified by

$$\Pi_1(S^1) = \mathbf{Z}$$

In this case, it is simple to write down an expression for the integer $n \in \mathbf{Z}$ which
classifies the map. It is the winding number,


$$n = \frac{1}{2\pi}\int_0^{2\pi} d\theta\ \frac{\partial\sigma}{\partial\theta} \in \mathbf{Z} \tag{2.69}$$

In this way, the space of field configurations decomposes into sectors, labelled by $n \in \mathbf{Z}$.
The vacuum sits in the sector $n = 0$. A particularly simple way to find classical solutions
is to minimize the energy in a sector $n \neq 0$. These solutions, which are stabilised by
their winding at infinity, are often referred to as topological solitons. In the present
context, these solitons will be the vortices that we are looking for.
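The winding number is simple to evaluate for a field sampled at discrete points on the asymptotic circle: add up the phase differences between neighbouring samples and divide by 2π. The following sketch does this for test configurations with a prescribed winding. The helper names and the sampling density are assumptions made for illustration, and the method needs the phase to change by less than π between neighbouring samples.

```python
import cmath
import math


def winding_number(phi_samples):
    """Winding of the phase of phi around a closed loop of samples.

    Sums the phase jumps between neighbouring samples, each taken in (-pi, pi].
    """
    total = 0.0
    count = len(phi_samples)
    for k in range(count):
        step = phi_samples[(k + 1) % count] / phi_samples[k]
        total += cmath.phase(step)
    return round(total / (2.0 * math.pi))


def sample_circle(n, points=200, v=1.0):
    """Samples of phi = v exp(i n theta) at equally spaced angles theta."""
    return [v * cmath.exp(1j * n * 2.0 * math.pi * k / points) for k in range(points)]


for n in (-2, -1, 0, 1, 3):
    print(n, winding_number(sample_circle(n)))
```

Because the result is rounded to an integer, small smooth deformations of the samples leave it unchanged, which is the discrete version of the statement that the sectors are disjoint.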


Figure 13: The profile for the magnetic field and Higgs field in a vortex. 


We’ll consider radially symmetric scalar profiles of the form 
$$\phi(r,\theta) = \rho(r)\,e^{in\theta} \tag{2.70}$$


We will first see why any configuration with $n \neq 0$ necessarily comes with a magnetic
field. Because our configurations are invariant under $x^3$ translations, they will always
have a linearly diverging energy corresponding to the fact that we have an infinite
string. But the energy density in the $(x^1, x^2)$ plane should integrate to a finite number.
We denote the energy per unit length of the vortex string by $\sigma$. The kinetic term for
the scalar gives a contribution to the energy that includes


$$\sigma \sim \int d\theta\,dr\ r\left|\frac{1}{r}\frac{\partial\phi}{\partial\theta} - iA_\theta\phi\right|^2 = \int d\theta\,dr\ r\left|\frac{in}{r} - iA_\theta\right|^2\rho^2$$


If we try to set $A_\theta = 0$, the energy has a logarithmic divergence from the integral over
the $(x^1, x^2)$ plane. To compensate we must turn on $A_\theta \to n/r$ as $r \to \infty$. But this
means that the configuration (2.70) is accompanied by a magnetic flux


$$\Phi = \int d^2x\ B_3 = \oint d\theta\ rA_\theta = 2\pi n \tag{2.71}$$


We see that the flux is quantised. This is the same quantisation condition that we saw 
for magnetic monopoles in Section 1.1 (albeit with a rescaled convention for the gauge 
field because we chose to put the coupling $e^2$ in front of the action). Note, however, that
here we haven’t invoked any quantum mechanics; in the Higgs phase, the quantisation 
of flux happens for topological reasons, rather than quantum reasons. 
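To make the logarithmic divergence explicit, it helps to evaluate the angular gradient energy for the winding profile with the gauge field switched off. The short computation below is a sketch under two simplifying assumptions that are introduced only for illustration: the profile has already reached ρ ≈ v outside some core radius r_0, and the plane is cut off at a large radius R.

```latex
\int_0^{2\pi} d\theta \int_{r_0}^{R} dr\ r \left|\frac{in}{r}\right|^2 v^2
  \;=\; 2\pi n^2 v^2 \int_{r_0}^{R} \frac{dr}{r}
  \;=\; 2\pi n^2 v^2 \,\log\frac{R}{r_0}
  \;\longrightarrow\; \infty \quad \text{as } R \to \infty .
% With A_\theta \to n/r the integrand |in/r - iA_\theta|^2 vanishes at large r,
% and the flux follows from the line integral at infinity:
\oint d\theta\ r A_\theta \;=\; \int_0^{2\pi} d\theta\ r\,\frac{n}{r} \;=\; 2\pi n .
```

The divergence is mild, but it is enough to rule out a finite-tension string unless the gauge field cancels the winding of the scalar at large distance.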


So far we have talked about configurations with winding, but not yet discussed 
whether they are solutions to the equations of motion. It is not hard to find solutions 
for a single vortex with $n = 1$ (or, equivalently, an anti-vortex with $n = -1$). We write
an ansatz for the gauge field as $A_\theta = f(r)/r$ and require $f(r) \to 1$ as $r \to \infty$. The
equations of motion then reduce to ordinary differential equations for $\rho(r)$ and $f(r)$.
Although no analytic solutions are known, it is simple to solve them numerically. These 
solutions are often referred to as Nielsen-Olesen vortices. 


Here we will build some intuition for what these look like without doing any hard 
work. The key feature is that $\phi$ winds asymptotically, as in (2.70), which means that
by the time we get to the origin it has something of an identity crisis and does not
know which way to point. The only way in which the configuration can remain smooth
is if $\phi = 0$ at the origin. But it costs energy for $\phi$ to deviate from the vacuum, so it
must do so over as small a scale as possible. This scale is $\xi \sim 1/m_\phi$.


Similarly, we know that the flux (2.71) must be non-zero. It is energetically preferable 
for this flux to sit at the origin, since this is where the Higgs field vanishes. This flux 
spreads over a region associated to the penetration length $L \sim 1/m_\gamma$. The resulting
profiles for the Higgs and magnetic fields are sketched in Figure 13.


Type I, Type II and Bogomolnyi


Before we explain why these vortices provide a good analogy for confinement, we first 
make a small aside. As described above, there are two length scales at play in the 
vortex solutions. The Higgs field drops to zero over a region of size $\sim \xi$ while the
magnetic field is spread over a region of size $\sim L$.


The ratio of these two scales determines the force between two parallel vortices. For 
far separated vortices, the force is exponentially suppressed, reflecting the fact that the 
theory is gapped. As they come closer, either their magnetic flux will begin to overlap
(if $L > \xi$), or their scalar profiles will begin to overlap (if $\xi > L$). The magnetic flux is
repulsive, while the scalar field is attractive. Based on this distinction, superconductors 
are divided into two classes: 


Type I: $\xi > L$. In this case, the overlap of the scalar profiles of vortices provides the
dominant, attractive force. If one applies a uniform magnetic field to a superconductor,
it turns into one big vortex. But a big vortex is effectively the same as turning the 
system back into the normal phase. This means that the superconductor resists an 
applied magnetic field until it reaches a critical value, at which point the system exits 
the Higgs phase. This means that no vortices are seen in Type I superconductors. 


Type II: $\xi < L$. Now the magnetic flux of the vortices overlaps as they approach,
resulting in a repulsive force. This means that when a uniform magnetic field is applied
to a Type II superconductor, it will form many vortices, each of which wants to be as
far from the others as possible. The result is a periodic array of vortices known as an
Abrikosov lattice. An example is shown in Figure 14 (see footnote 4).


Figure 14: An Abrikosov lattice in a Type II superconductor.
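The Type I and Type II distinction can be phrased directly in terms of the couplings of the Abelian Higgs model. The sketch below uses the masses quoted earlier, m_γ² = 2e²v² and m_φ² = 4λv², and takes L = 1/m_γ and ξ = 1/m_φ exactly. The latter is an assumption made for illustration, since the text only fixes these lengths up to order one factors. The function names are hypothetical.

```python
import math


def photon_mass(e, v):
    """m_gamma from m_gamma^2 = 2 e^2 v^2."""
    return math.sqrt(2.0) * e * v


def scalar_mass(lam, v):
    """m_phi from m_phi^2 = 4 lambda v^2."""
    return 2.0 * math.sqrt(lam) * v


def classify_superconductor(e, lam, v=1.0, rel_tol=1e-12):
    """Compare the correlation length xi with the penetration depth L."""
    penetration_depth = 1.0 / photon_mass(e, v)
    correlation_length = 1.0 / scalar_mass(lam, v)
    if math.isclose(penetration_depth, correlation_length, rel_tol=rel_tol):
        return "critical"
    if correlation_length > penetration_depth:
        return "Type I"   # scalar profiles overlap first: vortices attract
    return "Type II"      # magnetic flux overlaps first: vortices repel


print(classify_superconductor(e=1.0, lam=0.1))  # small lambda
print(classify_superconductor(e=1.0, lam=2.0))  # large lambda
```

With these conventions the two lengths coincide when λ = e²/2, which is where the two masses are equal.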


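At the boundary between Type I and Type II superconductors, the heuristic arguments
above suggest that there are no forces between vortices. Mathematically,
something rather pretty happens at this point. We have $m_\gamma^2 = m_\phi^2$ or, equivalently,
$\lambda = e^2/2$. At this special value, we can write the tension of the vortex string as the
sum of squares,


$$\begin{aligned}
\sigma &= \int d^2x\ \frac{1}{2e^2}B_3^2 + \sum_{i=1,2}|\mathcal{D}_i\phi|^2 + \frac{e^2}{2}\left(|\phi|^2 - v^2\right)^2 \\
&= \int d^2x\ |\mathcal{D}_1\phi - i\mathcal{D}_2\phi|^2 - i\phi^\dagger[\mathcal{D}_1, \mathcal{D}_2]\phi + \frac{1}{2e^2}\left(B_3 + e^2(|\phi|^2 - v^2)\right)^2 - B_3\left(|\phi|^2 - v^2\right) \\
&= \int d^2x\ |\mathcal{D}_1\phi - i\mathcal{D}_2\phi|^2 + \frac{1}{2e^2}\left(B_3 + e^2(|\phi|^2 - v^2)\right)^2 + v^2B_3
\end{aligned}$$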


where, in going to the last line, we used the fact that $[\mathcal{D}_1, \mathcal{D}_2] = -iF_{12} = +iB_3$. This
“completing the square” trick is the same kind of Bogomolnyi argument that we used in
Section 2.3 when discussing instantons. Since the two squares are necessarily non-negative,
the energy can be bounded by


$$\sigma \geq \int d^2x\ v^2B_3 = 2\pi v^2 n$$


4 This picture is taken from P. Goa et al, Supercond. Sci. Technol. 14, 729 (2001). A nice gallery
of vortex lattices can be found here.

Figure 15: The flux lines for a monopole and anti-monopole in vacuum.
Figure 16: The same flux lines in a superconductor.


where we have related the flux to the winding using (2.71). This is nice. In a sector 
with winding n > 0, there is a minimum energy bound. Moreover, we can saturate this 
bound by requiring that the quantities in the squares vanish, 


$$\mathcal{D}_1\phi = i\mathcal{D}_2\phi \quad\text{and}\quad B_3 = -e^2\left(|\phi|^2 - v^2\right) \tag{2.72}$$


These are the Bogomolnyi vortex equations. For n < 0, one can play a similar game 
with some minus signs shuffled around to derive Bogomolnyi equations for anti-vortices. 


The vortex equations (2.72) have a number of remarkable properties. In particular,
it can be shown that the general solution has 2n parameters which, at least for far separated
vortices, can be thought of as the positions of n vortices on the plane. Physically,
this arises because there is no force between the vortices. You can read more about
this in the lecture notes on Solitons.


The Confinement of Monopoles 


So far we’ve reviewed some basic physics of the Higgs phase of electromagnetism. But 
what does this have to do with confinement? To see the connection, we need to think 
about what would happen if we place a Dirac monopole inside a superconductor. 


To get some grounding, let’s first consider a monopole and anti-monopole in vacuum. 
Their magnetic field lines spread out in a pattern that is familiar from the games we 
played with iron filings and magnets when we were kids. This is sketched in Figure 15.
These field lines result in a Coulomb-like force between the two particles,
$V(r) \sim 1/r$.


Figure 17: A simulation of a separated quark anti-quark pair in QCD.
Figure 18: A simulation of a separated baryon state in QCD.


Now what happens when we place these particles inside a superconductor? The
magnetic flux lines can no longer spread out, but instead must form collimated tubes.
This is sketched in Figure 16. This tube of flux is the vortex that we
described above. As we have seen, happily, the magnetic flux carried by a single vortex
coincides with the magnetic flux emitted by a single Dirac monopole. The energy cost
in separating the monopole and anti-monopole by a distance r is now


$$V(r) = \sigma r$$


where $\sigma$ is the energy per unit length of the vortex string. In other words, inside a
superconductor, magnetic monopoles are confined! 


What lesson for Yang-Mills can we take away from this? First, it seems very plausible 
that the confinement of quarks in Yang-Mills is again due to the emergence of flux lines, 
this time (chromo)electric rather than magnetic flux lines. However, in contrast to the 
Abelian Higgs model, the Yang-Mills flux tube is not expected to arise as a semi- 
classical solution of the Yang-Mills equations. Instead, the flux tube should emerge in 
the strongly coupled quantum theory where one sums over many field configurations. 
Indeed, such flux tubes are seen in lattice simulations where they provide dominant
contributions to the path integral. An example is shown in Figure 17 (see footnote 5).


It is less obvious how these flux tubes form between N well separated quarks which
form a baryon. Simulations suggest that the flux tubes emitted by each quark can
join together at an N-string vertex. The picture for a well separated baryon in QCD,
with G = SU(3) gauge group, is shown in Figure 18.


We might also wish to take away another lesson from the superconducting story. In
the Abelian Higgs model, the electrically charged field $\phi$ condenses, resulting in the
confinement of monopoles. Duality then suggests that to confine electrically charged
objects, such as quarks, we should look to condense magnetic monopoles. This idea
smells plausible, but there has been scant progress in making it more rigorous in the
context of Yang-Mills theory. (For what it’s worth, the idea can be shown to work in
certain supersymmetric theories.) Nonetheless, it encourages us to look for magnetic
objects in non-Abelian gauge theories. We will describe these in Sections 2.6 and 2.8.


5 These simulations were created by Derek Leinweber. You can find a host of beautiful QCD
animations on his webpage.


Regge Trajectories 


The idea that quark anti-quark pairs are held together by flux tubes has experimental
support. Here we'll provide a rather simplistic model of this set up. Ignoring the overall
translational motion, the energy of two massless, relativistic quarks, joined together by
a string, is given by


$$E = p + \sigma r$$

with $p = p_1 - p_2$ the relative momentum. We’ll embrace the spirit of Bohr, and require
that the angular momentum is quantised: $J = pr \in \mathbf{Z}$. We can then write the energy
as


$$E = \frac{J}{r} + \sigma r$$


For a fixed J, this is minimized at $r = \sqrt{J/\sigma}$, which gives us the relationship between
the energy and angular momentum of the states,


$$E^2 \sim \sigma J$$
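For completeness, here is the minimisation written out. Nothing beyond the energy function above is assumed; the symbol r_⋆ for the minimising separation is introduced only for this computation.

```latex
\frac{dE}{dr} = -\frac{J}{r^2} + \sigma = 0
\quad\Rightarrow\quad
r_\star = \sqrt{\frac{J}{\sigma}} ,
\qquad
E(r_\star) = \frac{J}{r_\star} + \sigma r_\star = 2\sqrt{\sigma J}
\quad\Rightarrow\quad
E^2 = 4\sigma J .
```

The factor of 4 is specific to this crude model and should not be taken seriously; the robust statement is the linear relation between E² and J.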


We can now compare this to the data for hadrons. A plot of the mass$^2$ vs spin is known as a
Chew-Frautschi plot. It is shown in Figure 19 for light vector mesons (see footnote 6). We see
that families of mesons and their resonances do indeed sit on nice straight lines, referred
to as Regge trajectories. The slope of the lines is determined by the QCD string tension, which
turns out to be around $\sigma \sim 1.2\ {\rm GeV}^2$. Perhaps more surprisingly, the data also reveals
nice straight Regge trajectories in the baryon sector.


Figure 19: The Chew-Frautschi plot for light vector mesons.


6 This plot was taken from the paper by D. Ebert, R. Faustov and V. Galkin, arXiv:0903.5183.


2.5.3 Wilson Loops Revisited


Above we identified two different possible phases of Yang-Mills theory: the Coulomb
phase and the confining phase. The difference between them lies in the forces experienced
by two well-separated probe particles.

• Coulomb: $V(r) \sim 1/r$
• Confining: $V(r) \sim r$


To this, we could add a third possibility that occurs when the gauge field is Higgsed, 
so that electric charges are completely screened. In this case we have 


• Higgs: $V(r) \sim$ constant
We'll discuss this phase more in Section 2.7.3. 


Usually in a quantum field theory (or in a statistical field theory) we identify the 
phase by computing the expectation value of some order parameter. The question that 
we would like to ask here is: what is the order parameter for confinement? 


To answer this, we can rephrase our earlier discussion in terms of the path integral. 
To orient ourselves, let’s first return to Maxwell theory. If we want to compute the path 
integral in the presence of an electrically charged probe particle, we simply introduce 
the particle by its associated current $J^\mu$, which now acts as a source. We then add to
the action the term $A_\mu J^\mu$. Moreover, for a probe particle which moves along a worldline
C, the current J is a delta-function localised on C. We then compute the partition
function with the insertion $e^{i\oint_C A}$,


$$\left\langle \exp\left(i\oint_C A\right)\right\rangle = \int \mathcal{D}A\ \exp\left(i\oint_C A\right) e^{iS_{\rm Maxwell}} \tag{2.73}$$


where we’re being a little sloppy on the right-hand-side, omitting both gauge fixing 
terms and the normalisation factor coming from the denominator. 


In Yang-Mills, there is a similar story. The only difference is that we can’t just 
stipulate a fixed current $J^\mu$ because the term $A_\mu J^\mu$ is not gauge invariant. Instead, we
must introduce some internal colour degrees of freedom for the quark, as we described
previously in Section 2.1.3. As we saw, integrating over these colour degrees of freedom
leaves us with the Wilson loop W[C], which we take in the fundamental representation


$$W[C] = {\rm tr}\ \mathcal{P}\exp\left(i\oint_C A\right)$$


Performing the further path integral over the gauge fields A leaves us with the expectation
value of this Wilson loop


$$\langle W[C]\rangle = \int \mathcal{D}A\ {\rm tr}\ \mathcal{P}\exp\left(i\oint_C A\right) e^{iS_{YM}} \tag{2.74}$$
Now consider the specific closed loop C shown in Figure 20. We again take this to sit in
the fundamental representation. It has the interpretation that we create a quark anti-quark
pair, separated by a distance r, at some time in the past. These then propagate
forward for time T, before they annihilate back to the vacuum.


What behaviour would we expect from the expectation value $\langle W[C]\rangle$? We’ll work in
Euclidean space. Recall from our earlier lectures on quantum field theory that, for long
times, the path integral projects the system onto the lowest energy state. Before the quarks
appear, and after they’ve gone, this is the ground state of the system
which we can take to have energy zero. (Actually, you can take it
to have any energy you like; its contribution will disappear from our
analysis when we divide by the normalisation factor that is missing on
the right-hand-side of (2.73) and (2.74).) However, in the presence
of the sources, the ground state of the system has energy V(r). This
means that we expect the Euclidean path integral to give


$$\lim_{r,T\to\infty} \langle W[C]\rangle \sim e^{-V(r)T}$$


Figure 20: The rectangular loop C, with spatial extent r and temporal extent T.


This now gives us a way to test for the existence of the confining phase directly in
Yang-Mills theory. If the theory lies in the confining phase, we should find

$$\lim_{r,T\to\infty} \langle W[C]\rangle \sim e^{-\sigma A[C]} \tag{2.75}$$


where A[C] is the area of the loop C. This is known as the area law criterion for
confinement. We won’t be able to prove that Wilson loops in Yang-Mills exhibit an
area law, although we’ll offer an attempt in Section 4.2 when we discuss the strong
coupling expansion of lattice gauge theory. We will have more success in Sections 7 and
8 when we demonstrate confinement in lower dimensional gauge theories.


If a theory does not lie in the confining phase, we get different behaviour for the 
Wilson loop. For example, we could add scalar fields which condense and completely 
break the gauge symmetry. This is the Higgs phase, and we will discuss it in more 
detail in Section 2.7 where we first introduce dynamical matter fields. In the Higgs 
phase, we have 

$$\lim_{r,T\to\infty} \langle W[C]\rangle \sim e^{-\mu L}$$

where $L = 2(r + T)$ is the perimeter of the loop and $\mu$ is some mass scale associated
to the energy in the fields that screen the particle. This kind of perimeter law is
characteristic of the screening phase of a theory.
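In practice a Wilson loop typically has both an area piece and a perimeter piece, and one wants a way to isolate the string tension. A standard lattice diagnostic for this, not discussed in these notes, is the Creutz ratio, which combines four rectangular loops so that perimeter and constant terms cancel. The sketch below applies it to synthetic data of the assumed form ⟨W⟩ = exp(−σ r T − μ·2(r + T) + const) in lattice units. The model and all parameter values are illustrative, not simulation results.

```python
import math


def wilson_loop(r, t, sigma, mu, const=0.0):
    """Synthetic rectangular Wilson loop with area and perimeter terms."""
    return math.exp(-sigma * r * t - mu * 2 * (r + t) + const)


def creutz_ratio(w, r, t):
    """chi(r, t) = -log[ W(r,t) W(r-1,t-1) / (W(r,t-1) W(r-1,t)) ].

    Perimeter and constant contributions cancel; an area law leaves sigma.
    """
    return -math.log(w(r, t) * w(r - 1, t - 1) / (w(r, t - 1) * w(r - 1, t)))


def static_potential(w, r, t):
    """V(r) estimated from the decay of the loop in the time direction."""
    return -math.log(w(r, t + 1) / w(r, t))


def confining(r, t):
    return wilson_loop(r, t, sigma=0.2, mu=0.3)


def screened(r, t):
    return wilson_loop(r, t, sigma=0.0, mu=0.3)


print(creutz_ratio(confining, 4, 5))  # close to the input sigma
print(creutz_ratio(screened, 4, 5))   # close to zero
```

For the synthetic confining loop, `static_potential` returns σr + 2μ: the linear potential plus an r-independent self-energy coming from the perimeter term.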
Wilson Loops as Operators


There is a slightly different perspective on Wilson loops that will also prove useful: we 
can view them as operators on the Hilbert space of states. Since we are now dealing 
with Hilbert spaces and states, it’s important that we are back in Lorentzian signature. 


In quantum field theory, states are defined as living on a spacelike slice of the system. 
For this reason, we should first rotate our Wilson loop so that C is a spacelike, closed 
curve, sitting at a fixed point in time. The interpretation of the operator W[C] is 
that it adds to the state a loop of electric flux along C. To see this, we can again 
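revert to the canonical formalism that we introduced in Section 2.2. The electric field
is $E^i = -i\delta/\delta A_i(x)$, so we have


$$E^i(\mathbf{x})\,W[C] = {\rm tr}\ \mathcal{P}\left(\left[\oint_C dy^i\ \delta^3(\mathbf{x} - \mathbf{y})\right]\exp\left(i\oint_C A\right)\right)$$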


which indeed has support only on C. 


The expectation value $\langle W[C]\rangle$ is now interpreted as the amplitude for a loop of
electric flux $W[C]|0\rangle$ to annihilate to the vacuum $\langle 0|$. In the confining phase, this is
unlikely because the flux tube is locally stable. The flux tube can, of course, shrink
over time and disappear, but that’s not what $\langle W[C]\rangle$ is measuring. Instead, it’s looking
for the amplitude that the flux tube instantaneously disappears. This can happen only
through a tunnelling effect which, in Euclidean space, involves a string stretched across
the loop of flux. The Euclidean action of this string is proportional to its area,
again giving $\langle W[C]\rangle \sim e^{-\sigma A}$ with A[C] the minimal area bounding the curve.


In contrast, in the Higgs phase the string is locally unstable. Each part of the 
string can split into pieces and dissolve away. This is still unlikely: after all, it has to 
happen at all parts of the string simultaneously. Nonetheless, it is more likely than the 
corresponding process in the confining phase, and this is reflected in the perimeter law 
$\langle W[C]\rangle \sim e^{-\mu L}$.


2.6 Magnetic Probes 


Much of our modern understanding of gauge theories comes from the interplay between 
electric and magnetic degrees of freedom. In the previous section we explored how Yang-Mills
fields respond to electric probes. In this section, we will ask how they respond to
magnetic probes.




A warning: the material in this section is a little more advanced than what we covered 
until now and won’t be required for much of what follows. (An exception is Section 3.6 
which discusses discrete anomalies and builds on the machinery we develop here.) In 
particular, Sections 2.7 and 2.8 can both be read without reference to this section.


2.6.1 ’t Hooft Lines


Our first task is to understand how to construct an operator that corresponds to the 
insertion of a magnetic monopole. These are referred to as ’t Hooft lines. For electric
probes, we could build the corresponding Wilson line out of local fields $A_\mu$. But there
are no such fields that couple to magnetic charges. This means that we need to find a 
different way to describe the magnetic probes. 


We will achieve this by insisting that the fields of the theory have a prescribed singular 
behaviour on a given locus which, in our case, will be a line C in spacetime. Because 
such operators disrupt the other fields in the theory, they are sometimes referred to as 
disorder operators. 


’t Hooft Lines in Electromagnetism


To illustrate this idea, we first describe ’t Hooft lines in U(1) electromagnetism. We 
have already encountered magnetic monopoles in Section 1.1. Suppose that a monopole 
of charge m traces out a worldline C in $\mathbf{R}^{3,1}$. (We referred to magnetic charge as g in
Section 1.1, but this is now reserved for the Yang-Mills coupling so we have to change
notation.) For any $S^2$ that surrounds C, we then have


$$\int_{S^2} \mathbf{B}\cdot d\mathbf{S} = m \tag{2.76}$$


We normalise the U(1) gauge field to have integer electric charges. As explained in 
Section 1.1, the requirement that the monopole is compatible with these charges gives 
the Dirac quantisation condition (1.3), which now reads 


$$e^{im} = 1 \quad\Rightarrow\quad m \in 2\pi\mathbf{Z} \tag{2.77}$$


For the magnetic field to carry flux (2.76), we must impose singular boundary conditions
on the gauge field. As an example, suppose that we take the line C to sit at the spatial
origin x = 0 and extend in the temporal direction t. Then, as explained in Section 1.1,
we can cover the $S^2$ by two charts. Working in polar coordinates with $A_r = 0$ gauge,
in the northern hemisphere, we take the gauge field to have the singular behaviour


$$A_\phi \to \frac{m(1 - \cos\theta)}{4\pi r\sin\theta} \quad \text{as } r \to 0$$


There is a similar condition (1.7) in the southern hemisphere, related by a gauge
transformation.
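One way to see that this singular gauge field carries the right flux is to integrate it around a circle of latitude and compare with the flux of a radial field through the polar cap that the circle bounds. The sketch below assumes the normalisation A_φ = m(1 − cos θ)/(4π r sin θ) for the northern chart and a magnetic field of magnitude m/(4π r²); the function names and the sample values are illustrative.

```python
import math


def a_phi_north(r, theta, m):
    """Azimuthal component of the gauge field in the northern chart (assumed form)."""
    return m * (1.0 - math.cos(theta)) / (4.0 * math.pi * r * math.sin(theta))


def loop_integral(r, theta, m, steps=360):
    """Line integral of A around the circle of latitude at polar angle theta.

    The line element along the circle is r sin(theta) dphi.
    """
    dphi = 2.0 * math.pi / steps
    return sum(a_phi_north(r, theta, m) * r * math.sin(theta) * dphi for _ in range(steps))


def cap_flux(theta, m):
    """Flux of B = m / (4 pi r^2) through the polar cap of opening angle theta."""
    return 0.5 * m * (1.0 - math.cos(theta))


m = 2.0 * math.pi  # the minimal charge allowed by (2.77)
for theta in (0.3, math.pi / 2, 3.0):
    print(theta, loop_integral(1.7, theta, m), cap_flux(theta, m))
```

As θ approaches π the cap covers the whole sphere and the flux approaches m, while the gauge field itself becomes singular there. This is why a second chart is needed in the southern hemisphere.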


We now define the ’t Hooft line T[C] by requiring that we take the path integral only 
over fields subject to the requirement that they satisfy (2.76) on C. This is a rather 
unusual definition of an “operator” in quantum field theory. Nonetheless, despite its 
unfamiliarity, we can, at least in principle, use it to compute correlation functions of
T[C] with other, more traditional operators.


’t Hooft Lines in Yang-Mills 


What’s the analogous object in Yang-Mills theory with gauge group G? To explain the
generalisation of Dirac quantisation to an arbitrary, semi-simple Lie group we need to 
invoke a little bit of Lie algebra-ology that was covered in the Symmetries and Particles 
course. 


We work with a Lie algebra $\mathfrak{g}$. We denote the Cartan sub-algebra as $\mathbf{H} \subset \mathfrak{g}$. Recall
that this is a set of r mutually commuting generators, where r is the rank of the Lie 
algebra. Throughout the rest of this section, bold (and not silly gothic) font will denote 
an r-dimensional vector. 


We again define a ’t Hooft line for a timelike curve C sitting at the origin. We will
require that the magnetic field $B^i$, i = 1, 2, 3, takes the form

$$B^i \to \frac{\hat{x}^i}{4\pi r^2}\,Q(x) \quad \text{as } r \to 0$$

where Q(x) is a Lie algebra valued object which specifies the magnetic charge of the
’t Hooft line. Spherical symmetry requires that Q(x) be covariantly constant. We can
again cover the $S^2$ with two charts, and in each pick Q(x) to be a constant which, by
a suitable gauge transformation, we take to sit in the Cartan subalgebra. We write


$$Q = \mathbf{m}\cdot\mathbf{H}$$


for some r-dimensional vector m which determines the magnetic charge. We can think 
of this as r Dirac monopoles, embedded in the Cartan subalgebra. 


The requirement that the ’t Hooft lines are consistent in the presence of Wilson lines 
gives the generalised Dirac quantisation condition, 


$$\exp\left(i\,\mathbf{m}\cdot\mathbf{H}\right) = 1 \tag{2.78}$$


The twist is that this must hold for all representations of the Lie algebra. To see why 
this requirement affects the allowed magnetic charges, consider the case of G = SU(2). 
We can pick a $U(1) \subset SU(2)$ in which we embed a Dirac monopole of charge m. The
W-bosons have electric charge $q = \pm 1$ and are consistent with a ’t Hooft line of charge
$m = 2\pi$. However, our ’t Hooft line should also be consistent with the insertion of a
Wilson line in the fundamental representation, and this carries charge $q = \pm 1/2$. This
means that, for G = SU(2), the ’t Hooft line must carry $m = 4\pi$, twice the charge of
the simplest Dirac monopole.
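The SU(2) counting can be phrased as a small search: given the electric charges that appear, find the smallest magnetic charge for which exp(imq) = 1 holds for all of them. The sketch below measures m in units of 2π and uses exact rational arithmetic. The function names and the search cutoff are illustrative choices.

```python
from fractions import Fraction


def satisfies_dirac(m_over_2pi, charges):
    """True if exp(i m q) = 1 for every q, i.e. m q / (2 pi) is an integer."""
    return all((Fraction(m_over_2pi) * Fraction(q)).denominator == 1 for q in charges)


def minimal_magnetic_charge(charges, max_units=12):
    """Smallest positive m / (2 pi) compatible with all the given electric charges."""
    for units in range(1, max_units + 1):
        if satisfies_dirac(units, charges):
            return units
    return None


# W-bosons only: charges +1 and -1 under the chosen U(1).
adjoint_only = [1, -1]
# Adding a fundamental Wilson line brings in charges +1/2 and -1/2.
with_fundamental = [1, -1, Fraction(1, 2), Fraction(-1, 2)]

print(minimal_magnetic_charge(adjoint_only))      # in units of 2 pi
print(minimal_magnetic_charge(with_fundamental))  # in units of 2 pi
```

The second answer is twice the first, which is the statement that m = 4π is the minimal charge once fundamental Wilson lines are allowed.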


To extend this to a general group and representation, we need the concept of weights. 
Given a d dimensional representation, $|\mu_a\rangle$ with a = 1, ..., d of $\mathfrak{g}$, we may introduce a
set of weights, which are the eigenvalues


$$\mathbf{H}|\mu_a\rangle = \boldsymbol{\mu}_a|\mu_a\rangle \tag{2.79}$$

All such weights span the weight lattice $\Lambda_w(\mathfrak{g})$.


The weights of the adjoint representation are special and are referred to as roots. 
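Recall that these roots $\boldsymbol{\alpha}$ can be used to label the other generators of the Lie algebra,
which are denoted as $E_\alpha$. In the adjoint representation, the eigenvalue condition (2.79)
becomes the commutation relation $[\mathbf{H}, E_\alpha] = \boldsymbol{\alpha}E_\alpha$. Importantly, the roots also span
a lattice


$$\Lambda_{\rm root}(\mathfrak{g}) \subset \Lambda_w(\mathfrak{g})$$


The weights and roots have the property that

$$\frac{\boldsymbol{\alpha}\cdot\boldsymbol{\mu}}{\alpha^2} \in \frac{1}{2}\mathbf{Z}$$

for all $\boldsymbol{\mu} \in \Lambda_w(\mathfrak{g})$ and $\boldsymbol{\alpha} \in \Lambda_{\rm root}(\mathfrak{g})$. This is exactly what we need to solve the Dirac
quantisation condition (2.78), which becomes $\mathbf{m}\cdot\boldsymbol{\mu} \in 2\pi\mathbf{Z}$ for all $\boldsymbol{\mu} \in \Lambda_w(\mathfrak{g})$. We define
the co-root

$$\boldsymbol{\alpha}^\vee = \frac{2\boldsymbol{\alpha}}{\alpha^2}$$

These co-roots also span a lattice, which we call $\Lambda_{\rm co-root}(\mathfrak{g})$. Clearly, we have $\boldsymbol{\alpha}^\vee\cdot\boldsymbol{\mu} \in \mathbf{Z}$
for all $\boldsymbol{\alpha}^\vee \in \Lambda_{\rm co-root}(\mathfrak{g})$ and $\boldsymbol{\mu} \in \Lambda_w(\mathfrak{g})$. If the magnetic charge vector sits in ($2\pi$ times) the co-root
lattice, then the Dirac quantisation condition is obeyed. More generally, it turns out
that for simply connected groups we have


$$\mathbf{m} \in 2\pi\,\Lambda_{\rm co-root}(\mathfrak{g}) \tag{2.80}$$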


This is sometimes referred to as the Goddard-Nuyts-Olive (or GNO) quantisation condition.
We will look at the possible magnetic charges for non-simply connected groups
shortly.
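The integrality of the pairing between co-roots and weights can be checked explicitly for su(N). The sketch below uses the standard embedding in which the roots are e_i − e_j in R^N and the weights of the fundamental representation are e_k − (1/N)(1, ..., 1). Since these weights generate the weight lattice, it is enough to check the pairing on them. The function names are illustrative.

```python
from fractions import Fraction


def dot(a, b):
    """Euclidean inner product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def su_n_roots(n):
    """Roots e_i - e_j (i != j) of su(n), embedded in R^n."""
    roots = []
    for i in range(n):
        for j in range(n):
            if i != j:
                root = [Fraction(0)] * n
                root[i] = Fraction(1)
                root[j] = Fraction(-1)
                roots.append(tuple(root))
    return roots


def fundamental_rep_weights(n):
    """Weights e_k - (1/n)(1,...,1) of the fundamental representation."""
    shift = Fraction(1, n)
    return [tuple(Fraction(int(i == k)) - shift for i in range(n)) for k in range(n)]


def coroot(alpha):
    """Co-root 2 alpha / alpha^2."""
    return tuple(2 * a / dot(alpha, alpha) for a in alpha)


def gno_pairings(n):
    """All pairings between co-roots and weights of the fundamental."""
    return [dot(coroot(alpha), mu) for alpha in su_n_roots(n) for mu in fundamental_rep_weights(n)]


print(sorted(set(gno_pairings(3))))
```

Every pairing is an integer, so a magnetic charge equal to 2π times a co-root satisfies the quantisation condition against the fundamental Wilson line, and hence against all others.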
There is one last part of this story. The co-root lattice can be viewed as the root
lattice for a Lie algebra $\mathfrak{g}^\vee$, so that $\Lambda_{\rm co-root}(\mathfrak{g}) = \Lambda_{\rm root}(\mathfrak{g}^\vee)$. For simply laced algebras
(these are the ADE series, and so include su(N)), all roots have the same length and
are normalised to $\alpha^2 = 2$. In this case, the roots and co-roots are the same and $\mathfrak{g}^\vee = \mathfrak{g}$.
For non-simply laced algebras, the long and short roots get exchanged. This means that,
for example, $so(2N+1)^\vee = sp(N)$ and $sp(N)^\vee = so(2N+1)$.
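The exchange of long and short roots is easy to see in the smallest non-simply laced example, so(5). The sketch below assumes the standard presentation of its roots, with long roots ±e_1 ± e_2 normalised to α² = 2 and short roots ±e_1, ±e_2 with α² = 1, and simply tabulates the lengths before and after taking co-roots. The function names are illustrative.

```python
from collections import Counter
from fractions import Fraction


def dot(a, b):
    """Euclidean inner product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def so5_roots():
    """Roots of so(5): four long roots (+-e1 +-e2) and four short roots (+-e1, +-e2)."""
    signs = (1, -1)
    long_roots = [(Fraction(s1), Fraction(s2)) for s1 in signs for s2 in signs]
    short_roots = [(Fraction(s), Fraction(0)) for s in signs]
    short_roots += [(Fraction(0), Fraction(s)) for s in signs]
    return long_roots + short_roots


def coroot(alpha):
    """Co-root 2 alpha / alpha^2."""
    return tuple(2 * a / dot(alpha, alpha) for a in alpha)


def length_census(vectors):
    """Count how many vectors there are of each squared length."""
    return Counter(dot(v, v) for v in vectors)


roots = so5_roots()
coroots = [coroot(alpha) for alpha in roots]
print(length_census(roots))
print(length_census(coroots))
```

The co-roots are ±e_1 ± e_2 together with ±2e_1, ±2e_2, which is the root system of sp(2): the vectors that were long for so(5) are now the short ones.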


2.6.2 SU(N) vs SU(N)/$\mathbf{Z}_N$


There seems to be something of an imbalance between the Wilson line operators and 
the ’t Hooft line operators. Of course, these electric and magnetic probes are defined 
in rather different ways, but that’s not our concern. Instead, it’s slightly disconcerting 
that there are more Wilson line operators than ’t Hooft line operators. This is because 
Wilson line operators are labelled by representations R which, in turn, are associated to
elements of the weight lattice $\Lambda_w(\mathfrak{g})$. In contrast, ’t Hooft lines are labelled by elements
of $\Lambda_{\rm root}(\mathfrak{g}^\vee)$ which is a subset of $\Lambda_w(\mathfrak{g}^\vee)$. Roughly speaking, this means that Wilson
lines can sit in any representation, including the fundamental, while ’t Hooft lines can 
only sit in representations that arise from tensor products of the adjoint. Why? 


To better understand the allowed magnetic probes, we need to look more closely 
at the global topology of the gauge group. We will focus on pure Yang-Mills with 
G = SU(N). Because the gauge bosons live in the adjoint representation, they are 
blind to any transformation which sits in the centre $\mathbf{Z}_N \subset SU(N)$,


$$\mathbf{Z}_N = \left\{ e^{2\pi ik/N}\,\mathbf{1},\ k = 0, 1, \ldots, N-1 \right\}$$


The gauge bosons do not transform under this centre Zy subgroup. In the older 
literature, it is sometimes claimed that the correct gauge group of Yang-Mills is actually 
SU(N)/Zy. But this is a bit too fast. In fact, the right way to proceed is to understand 
that there are two different Yang-Mills theories, defined by the choice of gauge group 


$$G = SU(N) \quad\text{or}\quad G = SU(N)/\mathbb{Z}_N$$


Indeed, more generally we have a different theory with gauge group G = SU(N)/$\mathbb{Z}_p$
for any $\mathbb{Z}_p$ subgroup of $\mathbb{Z}_N$. The difference between these theories is rather subtle. We
can’t distinguish them by looking at the action, since this depends only on the shared 
su(N) Lie algebra. Moreover, this means that the correlation functions of all local 
operators are the same in the two theories so you don’t get to tell the difference by 
doing any local experiments. Nonetheless, different they are. The first place this shows 
up is in the kinds of operators that we can use to probe the theory. 


Figure 21: The figure on the left shows the allowed Wilson and ’t Hooft lines (in green)
for the gauge group SU(3). The figure on the right shows the allowed lines for gauge group
SU(3)/$\mathbb{Z}_3$.


Let’s start with the Wilson lines. As we saw in Section 2.5, these are labelled by 
a representation of the group. The representations of G = SU(N)/$\mathbb{Z}_N$ are a subset
of those of G = SU(N); any representation that transforms non-trivially under $\mathbb{Z}_N$
is prohibited. This limits the allowed Wilson lines. In particular, the theory with
G = SU(N)/$\mathbb{Z}_N$ does not admit the Wilson line in the fundamental representation,
but Wilson lines in the adjoint representation are allowed. Similarly, the theory with
gauge group G = SU(N)/$\mathbb{Z}_N$ cannot be coupled to fundamental matter; it can be
coupled to adjoint matter. 


This has a nice description in terms of the lattices that we introduced. For G = 
SU(N), the representations are labelled by the weight lattice $\Lambda_w(\mathfrak{g})$. (The precise
statement is that there is a one-to-one correspondence between representations and
$\Lambda_w(\mathfrak{g})/W$ where W is the Weyl group.) However, for G = SU(N)/$\mathbb{Z}_N$, the representations
are labelled by the root lattice $\Lambda_{\text{root}}(\mathfrak{g})$. Indeed, the difference between the weight
and root lattice for $\mathfrak{g}$ = su(N) is precisely the centre,


$$\Lambda_w(\mathfrak{g})/\Lambda_{\text{root}}(\mathfrak{g}) = \mathbb{Z}_N$$


Now we come to the ’t Hooft lines. When we introduced ’t Hooft lines in the previous 
section, we were implicitly working with the universal cover of the gauge group, so 
that all possible Wilson lines were allowed. The requirement that magnetic charges are 
compatible with all representations and, in particular, the fundamental representation, 
resulted in the GNO condition (2.80) in which ’t Hooft lines are labelled by $\Lambda_{\text{root}}(\mathfrak{g})$.
But what if we work with G = SU(N)/$\mathbb{Z}_N$? Now we have fewer Wilson lines, and so
the demands of Dirac quantisation are less onerous. Correspondingly, in this theory
the ’t Hooft lines are labelled by $\Lambda_w(\mathfrak{g})$.


We can summarise the situation by labelling any line operator by a pair of integers


$$(z^e, z^m) \in \mathbb{Z}_N \times \mathbb{Z}_N \qquad (2.81)$$


These describe how a given line operator transforms under the electric and magnetic
centres of the group. If we have two line operators, labelled by $(z^e, z^m)$ and $(z'^e, z'^m)$,
then Dirac quantisation requires $z^e z'^m - z^m z'^e = 0$ mod N. Note the similarity with
the quantisation condition on dyons (1.4) that we met earlier.


For gauge group G = SU(N), the line operators are labelled by $(z^e, 0)$ with $z^e = 0, \ldots, N-1$.
Note that this doesn’t mean that there are no magnetically charged ’t
Hooft lines: just that these lines sit in the root lattice and so have $z^m = 0$ mod N.


In contrast, for G = SU(N)/$\mathbb{Z}_N$ the line operators are labelled by $(0, z^m)$ with
$z^m = 0, \ldots, N-1$. This time the Wilson lines must transform trivially under the
centre of the group, so $z^e = 0$ mod N. The resulting line operators for G = SU(3) and
G = SU(3)/$\mathbb{Z}_3$ are shown in Figure 21. Yang-Mills with G = SU(N) has more Wilson
lines; Yang-Mills with G = SU(N)/$\mathbb{Z}_N$ has more ’t Hooft lines.
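

As a quick check of this bookkeeping, here is a minimal Python sketch that lists the charges $(z^e, z^m)$ for a given N and tests the Dirac quantisation condition pairwise. The function names are our own and purely illustrative; only the condition $z^e z'^m - z^m z'^e = 0$ mod N is taken from the text.

```python
from itertools import product


def dirac_pairing(a, b, n):
    """Return (z^e z'^m - z^m z'^e) mod n for two charges a = (z^e, z^m), b = (z'^e, z'^m)."""
    return (a[0] * b[1] - a[1] * b[0]) % n


def mutually_local(lines, n):
    """True if every pair of charges in `lines` obeys Dirac quantisation mod n."""
    return all(dirac_pairing(a, b, n) == 0 for a, b in product(lines, repeat=2))


def su_n_lines(n):
    """Charges (z^e, 0) of the theory with G = SU(N)."""
    return [(ze, 0) for ze in range(n)]


def su_n_mod_zn_lines(n):
    """Charges (0, z^m) of the theory with G = SU(N)/Z_N."""
    return [(0, zm) for zm in range(n)]


if __name__ == "__main__":
    n = 3
    # Each spectrum is consistent on its own.
    print(mutually_local(su_n_lines(n), n))
    print(mutually_local(su_n_mod_zn_lines(n), n))
    # The fundamental Wilson line (1, 0) and the fundamental 't Hooft line (0, 1)
    # have a non-zero pairing, so they cannot sit in the same theory.
    print(dirac_pairing((1, 0), (0, 1), n))
```

Adding the charge (0, 1) to the SU(3) list makes `mutually_local` fail, which is the statement that we must choose between the fundamental Wilson line and the fundamental ’t Hooft line.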


There is a slightly more sophisticated way of describing these different line operators 
using the idea of generalised symmetries. We postpone this discussion until Section 3.6 
where we will find an application in discrete anomalies. 


The Theta Angle and the Witten Effect 


The Witten effect gives rise to an interesting interplay between ’t Hooft lines and the 
theta angle of Yang-Mills. Recall from Section 1.2.3, that a Dirac monopole of charge 
m in Maxwell theory picks up an electric charge proportional to the $\theta$ angle, given by


$$q = \frac{\theta m}{2\pi}$$


This analysis carries over to ’t Hooft lines in both Maxwell and Yang-Mills theories. 
In the latter case, a shift of $\theta \to \theta + 2\pi$ changes the electric charge carried by a line
operator,


$$\theta \to \theta + 2\pi \quad\Rightarrow\quad (z^e, z^m) \to (z^e + z^m, z^m)$$


For G = SU(N), this maps the spectrum of line operators back to itself. However,
for G = SU(N)/$\mathbb{Z}_N$ there is something of a surprise, because after a shift by $2\pi$, the
spectrum of line operators changes. This is shown in Figure 22 for G = SU(3)/$\mathbb{Z}_3$. We
learn that the theory is not invariant under a shift of $\theta \to \theta + 2\pi$. Instead, to return to
our original theory, with the same line operators, we must send $\theta \to \theta + 2\pi N$. In other
words,


$$G = SU(N)\ \text{has}\ \theta \in [0, 2\pi) \ , \qquad G = SU(N)/\mathbb{Z}_N\ \text{has}\ \theta \in [0, 2\pi N)$$


Figure 22: The spectrum of dyonic line operators in gauge group SU(3)/$\mathbb{Z}_3$, shown for $\theta = 0$
(on the left), $\theta = 2\pi$ (in the middle) and $\theta = 4\pi$ (on the right).


We’ll explore some consequences of this in Section 3.6 when we discuss anomalies in 
discrete symmetries. 
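

The extended periodicity can be seen by brute force. The following Python sketch applies the shift $(z^e, z^m) \to (z^e + z^m, z^m)$ repeatedly to a set of line operator charges and counts how many shifts of $2\pi$ are needed before the set returns to itself. The helper names are hypothetical, chosen only for this illustration.

```python
def shift_theta(lines, n, k=1):
    """Apply theta -> theta + 2*pi*k to a set of charges (z^e, z^m), working mod n."""
    return {((ze + k * zm) % n, zm) for ze, zm in lines}


def theta_period(lines, n):
    """Smallest k > 0 for which theta -> theta + 2*pi*k maps the spectrum to itself."""
    lines = set(lines)
    k = 1
    while shift_theta(lines, n, k) != lines:
        k += 1
    return k


if __name__ == "__main__":
    n = 3
    su3 = {(ze, 0) for ze in range(n)}          # G = SU(3)
    su3_mod_z3 = {(0, zm) for zm in range(n)}   # G = SU(3)/Z_3
    print(theta_period(su3, n))         # spectrum unchanged after a single shift
    print(theta_period(su3_mod_z3, n))  # spectrum only returns after N shifts
    print(sorted(shift_theta(su3_mod_z3, n)))   # the spectrum at theta = 2*pi
```

The loop always terminates for these two spectra, because k = N shifts every electric charge by a multiple of N.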


One of the arguments we gave in Section 2.2 for the periodicity $\theta \in [0, 2\pi)$ was the
appropriate quantisation of the topological charge $\int d^4x\ \text{tr}\,{}^\star F^{\mu\nu} F_{\mu\nu}$. Instantons provide
solutions to the equations of motion with non-vanishing topological charge. For Yang-Mills
with G = SU(N)/$\mathbb{Z}_N$, the enlarged range of $\theta$ suggests that there might be
“fractional instantons”, configurations that carry $1/N^{\text{th}}$ the charge of an instanton.
In fact, there are no such non-singular configurations on $\mathbb{R}^4$. But these fractional
instantons do arise on manifolds with non-trivial topology. For example, if we take
Euclidean spacetime to be $T^4$, we can impose twisted boundary conditions in which,
upon going around any circle, gauge fields come back to themselves up to a gauge
transformation which lies in the centre $\mathbb{Z}_N$. Such boundary conditions are allowed for
gauge group G = SU(N)/$\mathbb{Z}_N$, but not for G = SU(N). One can show that these
classes of configurations carry the requisite fractional topological charge.


’t Hooft Lines as Order Parameters 


One of the primary motivations for introducing line operators is to find order parameters 
that will distinguish between different phases of the theory. When G = SU(N) we 
have the full complement of Wilson lines. As we saw in Section 2.5, an area law for the
fundamental Wilson loop signals that the theory lies in the confining phase, which is
the expected behaviour for pure Yang-Mills. If we also add scalar fields to the theory,
these could condense so that we sit in the Higgs phase; in this case the Wilson loop 
exhibits a perimeter law. 


If the gauge group is G = SU(N)/$\mathbb{Z}_N$, we no longer have the fundamental Wilson
line at our disposal. Instead, we have the fundamental ’t Hooft line with $z^m = 1$,
and this now acts as our order parameter. Since the local dynamics is independent of 
the global topology of the gauge group, pure Yang-Mills theory is again expected to 
confine. But, as in our discussion of superconductors in Section 2.5.2, the confinement 
of electric charge is equivalent to the screening of magnetic charge. This means that 
the signature of electric confinement is now a perimeter law for the ’t Hooft line. 


We can also consider G = SU(N)/$\mathbb{Z}_N$ Yang-Mills in the Higgs phase. The theory
does not admit scalar fields in the fundamental representation, so we introduce adjoint
scalars which subsequently condense. A single adjoint scalar will break the gauge
group to its maximal torus, $U(1)^{N-1}$, but with two misaligned adjoint Higgs fields we
can break the gauge symmetry completely. This is the Higgs phase. As described in 
Section 2.5.2, the Higgs phase can be thought of as confinement of magnetic charges. 
Correspondingly, the ’t Hooft line now exhibits an area law. 


That’s All Well and Good, but... 


The difference between Yang-Mills with G = SU(N) and G = SU(N)/$\mathbb{Z}_N$ seems rather
formal. As we mentioned above, all correlation functions of local operators in the two 
theories coincide, which means that any local experiment that we can perform will 
agree. The theories only differ in the kinds of non-local probes that we can introduce. 
You might wonder whether this is some pointless intellectual exercise. 


If we consider Yang-Mills on flat $\mathbb{R}^4$, then there is some justification in ignoring
these subtleties: the physics of the two theories is the same, and we’re just changing 
the way we choose to describe it. However, even in this case these subtleties will help 
us say something non-trivial about the dynamics as we will see in Section 3.6 when we 
discuss discrete anomalies. 


The real differences between the two theories arise when we study them on background
manifolds with non-trivial topology. Here the two theories can have genuinely
different dynamics. Perhaps the most straightforward case arises for Yang-Mills coupled
to a single, massless adjoint Weyl fermion. This theory turns out to have supersymmetry
and goes by the name of $\mathcal{N} = 1$ super Yang-Mills. Although supersymmetry is
beyond the scope of these lectures, it turns out that it provides enough of a handle for
us to make quantitative statements about their dynamics. If we consider these theories
on spacetime $\mathbb{R}^{1,2} \times S^1$, the low energy dynamics, specifically the number of ground
states, does depend on the global topology of the gauge group.


2.6.3 What is the Gauge Group of the Standard Model? 


We all know the answer to the question in the heading. The gauge group of the Standard 
Model is 
$$G = U(1)_Y \times SU(2) \times SU(3)$$


Or is it? 


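The fermions in a single generation sit in the following representations of G,


$$\begin{array}{llcl}
\text{Leptons:} & l_L : (\mathbf{2},\mathbf{1})_{-3} & \Rightarrow & (z_2^e, z_3^e)_Y = (1,0)_{-3} \\
 & e_R : (\mathbf{1},\mathbf{1})_{-6} & \Rightarrow & (z_2^e, z_3^e)_Y = (0,0)_{-6} \\
\text{Quarks:} & q_L : (\mathbf{2},\mathbf{3})_{+1} & \Rightarrow & (z_2^e, z_3^e)_Y = (1,1)_{+1} \\
 & u_R : (\mathbf{1},\mathbf{3})_{+4} & \Rightarrow & (z_2^e, z_3^e)_Y = (0,1)_{+4} \\
 & d_R : (\mathbf{1},\mathbf{3})_{-2} & \Rightarrow & (z_2^e, z_3^e)_Y = (0,1)_{-2}
\end{array}$$


where the subscript denotes $U(1)_Y$ hypercharge Y, normalised so that $Y \in \mathbb{Z}$. We could
add to this the right-handed neutrino $\nu_R$ which is a gauge singlet. In the table above,
we have also written the charges $z_2^e$ and $z_3^e$ under the $\mathbb{Z}_2 \times \mathbb{Z}_3$ centre of SU(2) × SU(3).
Finally, the Higgs boson sits in the representation $(\mathbf{2},\mathbf{1})_3 \Rightarrow (z_2^e, z_3^e)_Y = (1,0)_3$.
Each of these representations has the property that


$$Y = 3z_2^e - 2z_3^e \mod 6$$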

This means that there is a $\mathbb{Z}_6$ subgroup of $G = U(1)_Y \times SU(2) \times SU(3)$ under which
all the fields are invariant: we must simultaneously act with the $\mathbb{Z}_6 = \mathbb{Z}_2 \times \mathbb{Z}_3$ centre
of SU(2) × SU(3), together with a $\mathbb{Z}_6 \subset U(1)_Y$. Because nothing transforms under
this $\mathbb{Z}_6$ subgroup, you can sometimes read in the literature that the true gauge group
of the Standard Model is


$$G = \frac{U(1)_Y \times SU(2) \times SU(3)}{\Gamma} \qquad (2.82)$$


where $\Gamma = \mathbb{Z}_6$. But this is also too fast. The correct statement is that there is a fourfold
ambiguity in the gauge group of the Standard Model: it takes the form (2.82), where
$\Gamma$ is a subgroup of $\mathbb{Z}_6$, i.e.


$$\Gamma = 1,\ \mathbb{Z}_2,\ \mathbb{Z}_3\ \text{or}\ \mathbb{Z}_6$$


We note in passing that we can embed the Standard Model in a grand unified group, 
such as SU(5) or Spin(10), only if $\Gamma = \mathbb{Z}_6$.
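

The relation between hypercharge and the centre charges is easy to verify field by field. The short Python sketch below does this for the representations listed above, and also counts the subgroups of $\mathbb{Z}_6$. The table layout and function names are our own; the charges are those quoted in the text.

```python
# (field, z_2^e, z_3^e, Y) with hypercharge normalised so that Y is an integer.
FIELDS = [
    ("l_L", 1, 0, -3),
    ("e_R", 0, 0, -6),
    ("q_L", 1, 1, 1),
    ("u_R", 0, 1, 4),
    ("d_R", 0, 1, -2),
    ("Higgs", 1, 0, 3),
]


def z6_mismatch(z2, z3, hypercharge):
    """Return (Y - 3 z_2^e + 2 z_3^e) mod 6, which vanishes when Y = 3 z_2^e - 2 z_3^e mod 6."""
    return (hypercharge - 3 * z2 + 2 * z3) % 6


def subgroup_orders(n):
    """Orders of the subgroups of the cyclic group Z_n, one for each divisor of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


if __name__ == "__main__":
    for name, z2, z3, hypercharge in FIELDS:
        print(name, z6_mismatch(z2, z3, hypercharge))
    print(subgroup_orders(6))  # the four possible choices of Gamma
```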




As we mentioned above, the choice of $\Gamma$ does not affect any local correlation functions
and, in particular, does not affect physics at the LHC. Nonetheless, each choice of
$\Gamma$ defines a different theory and, in principle, the distinction could have observable
consequences. One place that the difference in $\Gamma$ shows up is in the magnetic sector.
Previously we discussed the allowed ’t Hooft lines. However, there is a folk theorem that
when a quantum field theory is coupled to gravity then any allowed electric or magnetic
charge has a realisation as a physical state. In other words, particles (or groups of
particles) should exist with each of the allowed electric and magnetic charges. We’ll
see in Section 2.8 how magnetic monopoles can arise as dynamical particles in a non- 
Abelian gauge theory. 


The arguments for this are far from rigorous and, for magnetic charges, boil down to 
the fact that an attempt to define an infinitely thin ’t Hooft line in a theory coupled 
to gravity will result in a black hole. If we now let this black hole evaporate, and insist 
that there are no remnants, then it should spit out a particle with the desired magnetic 
charge. 


So what magnetic monopoles are allowed for each choice of $\Gamma$? First, let’s recall how
electromagnetism arises from the Standard Model. The electromagnetic charge q of
any particle is related to the hypercharge Y and the SU(2) charge $T^3$ by


$$q = \frac{Y}{6} + T^3$$


This gives us the familiar electric charges: for the electron $q = -1$; for the up quark
$q = +2/3$; and for the down quark $q = -1/3$.


We denote the magnetic charge under $U(1)_Y$ as $m_Y$. As we explained in Section 2.5.2,
when a Higgs field condenses, many of the magnetically charged states are confined. In
the Standard Model, those that survive must have


$$\frac{6 m_Y}{2\pi} = z_2^m \mod 2$$


The magnetic charge under $U(1)_Y$ and SU(2) then conspires so that these states are
blind to the Higgs field. For such states, the resulting magnetic charge under electromagnetism
is


$$m = 6 m_Y$$


Now we’re in a position to see how the global structure of the gauge group affects
the allowed monopole charge. Suppose that we take $\Gamma = 1$. Here, the monopoles must
obey the Dirac quantisation condition with respect to each gauge group individually.
This means that $m_Y \in 2\pi\mathbb{Z}$, and so the magnetic charge of any particle is quantised
as $m \in 12\pi\mathbb{Z}$. This is six times greater than the magnetic charge envisaged by Dirac.
Of course, Dirac only knew about the existence of the electron with charge q = 1. The
quarks, together with the structure of the electroweak force, impose a more stringent 
constraint. 
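

The role of the quarks can be illustrated with a few lines of Python. This is only a sketch: the helper names are ours, and we assume that Dirac quantisation with respect to electromagnetism takes the form $q\,m \in 2\pi\mathbb{Z}$, which is the normalisation in which Dirac's monopole has $m = 2\pi$ for a unit electric charge.

```python
from fractions import Fraction

# Electric charges quoted in the text.
CHARGES = {
    "electron": Fraction(-1),
    "up": Fraction(2, 3),
    "down": Fraction(-1, 3),
}


def em_charge_over_2pi(k):
    """m / (2*pi) for a monopole with m_Y = 2*pi*k, using m = 6 m_Y."""
    return 6 * k


def obeys_dirac(m_over_2pi, q):
    """Assumed condition: q * m lies in 2*pi*Z, i.e. q * (m / 2*pi) is an integer."""
    return (Fraction(q) * m_over_2pi).denominator == 1


if __name__ == "__main__":
    for m_over_2pi in (1, em_charge_over_2pi(1)):
        result = {name: obeys_dirac(m_over_2pi, q) for name, q in CHARGES.items()}
        print(m_over_2pi, result)
```

Under this assumption, the minimal Dirac monopole with $m = 2\pi$ is compatible with the electron but not, taken on its own, with the fractionally charged quarks.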


In contrast, if $\Gamma = \mathbb{Z}_6$, more magnetic charges are allowed. This is entirely analogous
to the situation that we saw in the previous section. The Dirac quantisation condition
now imposes a single constraint on the combined gauge charges from each factor of the
gauge group,


$$3 z_2^e z_2^m + 2 z_3^e z_3^m - \frac{6 Y m_Y}{2\pi} \in 6\mathbb{Z}$$


But this gives us more flexibility. Now we are allowed a magnetic monopole with
$m_Y = \frac{1}{6} \times 2\pi$ provided that it also carries a magnetic charge under the other groups,
$z_2^m = 1$ and $z_3^m = 1$. In other words, the Standard Model with $\Gamma = \mathbb{Z}_6$ admits
the kind of magnetic monopole that Dirac would have expected, with $m = 2\pi$. Of
course, this obeys Dirac quantisation with respect to the electron. But it also obeys 
Dirac quantisation with respect to the fractionally charged quarks because it carries a 
compensating non-Abelian magnetic charge. 


2.7 Dynamical Matter 


Until now, we have (mostly) focussed on pure Yang-Mills, without any additional, 
dynamical matter fields. It’s time to remedy this. We will consider coupling either 
scalar fields, $\phi$, or Dirac spinors $\psi$ to Yang-Mills.


Each matter field must transform in a representation R of the gauge group G. In 
the Lagrangian, the information about our chosen representation is often buried in the 
covariant derivative, which reads 


$$D_\mu = \partial_\mu - i A^a_\mu T^a(R)$$


where $T^a(R)$ are the generators of the Lie algebra in the representation R. For scalar
fields, the action is


$$S_{\text{scalar}} = \int d^4x\ D_\mu\phi^\dagger D^\mu\phi - V(\phi)$$


where $V(\phi)$ can include both mass terms and $\phi^4$ interactions. For spinors, the action
is


$$S_{\text{fermion}} = \int d^4x\ i\bar\psi \slashed{D} \psi - m\bar\psi\psi$$


If we have both scalars and fermions then we can also include Yukawa interactions
between them.


Our ultimate goal is to understand the physics described by non-Abelian gauge theories
coupled to matter. What is the spectrum of excitations of these theories? How do
these excitations interact with each other? How does the system respond to various probes
and sources? In this section, we will start to explore this physics. 


2.7.1 The Beta Function Revisited 


The first question we will ask is: how does the presence of these matter degrees of 
freedom affect the running of the gauge coupling $g^2(\mu)$? This is simplest to answer for
massless scalars and fermions. Suppose that we have $N_s$ scalars in a representation
$R_s$ and $N_f$ Dirac fermions in a representation $R_f$. The 1-loop running of the gauge
coupling is


$$\frac{1}{g^2(\mu)} = \frac{1}{g_0^2} - \frac{1}{(4\pi)^2}\left[\frac{11}{3} I(\text{adj}) - \frac{1}{3} N_s I(R_s) - \frac{4}{3} N_f I(R_f)\right]\log\left(\frac{\Lambda_{UV}^2}{\mu^2}\right) \qquad (2.83)$$


This generalises the Yang-Mills beta function (2.56). Recall that the Dynkin indices 
I(R) are group theoretic factors defined by the trace normalisations, 


$$\text{tr}\, T^a(R)\,T^b(R) = I(R)\,\delta^{ab}$$


and we are working in the convention in which $I(F) = \frac{1}{2}$ for the fundamental (or
minimal) representation of any group. 
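

To get a feel for the one-loop result, the bracket multiplying the logarithm can be evaluated with exact rational arithmetic. The sketch below is illustrative only: the function name and the (multiplicity, index) input format are our own choices, and the examples use the values $I(\text{adj}) = N_c$ and $I(F) = 1/2$ appropriate for $SU(N_c)$ in the convention just stated.

```python
from fractions import Fraction


def one_loop_bracket(index_adj, scalars=(), fermions=()):
    """Return 11/3 I(adj) - 1/3 sum N_s I(R_s) - 4/3 sum N_f I(R_f).

    `scalars` and `fermions` are iterables of (multiplicity, Dynkin index) pairs.
    A positive result means that the coupling grows towards the infra-red.
    """
    total = Fraction(11, 3) * index_adj
    total -= Fraction(1, 3) * sum(n * Fraction(i) for n, i in scalars)
    total -= Fraction(4, 3) * sum(n * Fraction(i) for n, i in fermions)
    return total


if __name__ == "__main__":
    half = Fraction(1, 2)
    # SU(3) with six fundamental Dirac fermions: 11 - 4 = 7.
    print(one_loop_bracket(3, fermions=[(6, half)]))
    # SU(2) with 44 fundamental scalars: the one-loop running switches off.
    print(one_loop_bracket(2, scalars=[(44, half)]))
```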


When a field has mass m, it contributes to the running of the coupling only at scales
$\mu > m$, and decouples when $\mu < m$. There is a smooth crossover from one behaviour to
the other at scales $\mu \sim m$, but the details of this will not be needed in these lectures.


Here we will briefly sketch the derivation of the running of the coupling, following 
Section 2.4.2. We will then look at some of the consequences of this result. 


The Beta Function for Scalars 


If we integrate out a massless, complex scalar field, we get a contribution to the effective 
action for the gauge field given by 


$$S_{\text{eff}}[A] = \frac{1}{2g^2}\int d^4x\ \text{tr}\,F_{\mu\nu}F^{\mu\nu} + \text{Tr}\log(-D^2)$$


But this is something we’ve computed before, since it is the same as the ghost contribution
to the effective action. The only differences are that we get a plus sign instead
of a minus sign, because our scalars are the sensible kind that obey spin statistics,
and that we pick up the relevant trace coefficient I(R), as opposed to I(adj) for the
ghosts. We can then immediately import our results from Section 2.4.2 to get the scalar
contribution in (2.83).


The Beta Function for Fermions 


If we integrate out a massless Dirac fermion, we get a contribution to the effective 
action for the gauge field given by 


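$$S_{\text{eff}}[A] = \frac{1}{2g^2}\int d^4x\ \text{tr}\,F_{\mu\nu}F^{\mu\nu} - \log\det(i\slashed{D})$$


To compute the determinant, it’s useful to expand as


$$\begin{aligned}
\det(i\slashed{D}) &= \det{}^{1/2}\left(-\gamma^\mu\gamma^\nu D_\mu D_\nu\right) \\
&= \det{}^{1/2}\left(-\tfrac{1}{2}\{\gamma^\mu,\gamma^\nu\}D_\mu D_\nu - \tfrac{1}{2}[\gamma^\mu,\gamma^\nu]D_\mu D_\nu\right) \\
&= \det{}^{1/2}\left(-D^2 + \tfrac{i}{4}[\gamma^\mu,\gamma^\nu]F_{\mu\nu}\right)
\end{aligned}$$


where, to go to the final line, we have used both the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2\delta^{\mu\nu}$,
as well as the fact that $[D_\mu, D_\nu] = -iF_{\mu\nu}$. The contribution to the effective action is
then


$$\begin{aligned}
-\log\det(i\slashed{D}) &= -\frac{1}{2}\text{Tr}\log\left(-D^2 + \frac{i}{4}[\gamma^\mu,\gamma^\nu]F_{\mu\nu}\right) \\
&= -2\,\text{Tr}\log(-D^2) + [\gamma^\mu,\gamma^\nu]F_{\mu\nu}\ \text{terms}
\end{aligned}$$


Here the 1/2 has changed into a 2 after tracing over the spinor indices. We’re left
having to compute the contribution from the $[\gamma^\mu, \gamma^\nu]F_{\mu\nu}$ terms. This is very similar in
spirit to the extra term that we had to compute for the gauge fluctuations in Section
2.4.2. However, the difference in spin structure means that it differs from the gauge
contribution by a factor of 1/2. The upshot is that we have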


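$$-\log\det(i\slashed{D}) = -\frac{1}{2}\left[\frac{4}{3} - 4\right]\frac{I(R)}{(4\pi)^2}\int\frac{d^4k}{(2\pi)^4}\ \text{tr}\left[A_\mu(k)A_\nu(-k)\right]\left(k^\mu k^\nu - k^2\delta^{\mu\nu}\right)\log\left(\frac{\Lambda_{UV}^2}{k^2}\right)$$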
which gives the fermionic contribution to the running of the gauge coupling in (2.83). 
Note that, once again, contributions from the extra spin term (the −4) overwhelm
the contribution from the kinetic term (the +4/3). But, because we are dealing with 
fermions, there is an overall minus sign. This means that fermions, like scalars, give a 
positive contribution to the beta function. 


2.7.2 The Infra-Red Phases of QCD-like Theories


We will start by ignoring the scalars and considering non-Abelian gauge theories coupled 
to fermions. In many ways, this is the most subtle and interesting class of quantum field 
theories and we will devote Sections 3 and 5 to elucidating some of their properties. 
Here we start by giving a brief tour of what is expected from these theories. 


Obviously, there are many gauge groups and representations that we could pick. We 
will restrict ourselves to gauge group $SU(N_c)$, where $N_c$ is referred to as the number
of colours. We will couple to this gauge field $N_f$ Dirac fermions, each transforming
in the fundamental representation of the gauge group. Here $N_f$ is referred to as the
number of flavours. We will further take the fermions to be massless, although we 
will comment briefly on what happens as they are given masses. This class of theories 
will be sufficient to exhibit many of the interesting phenomena that we care about. 
Moreover, this class of theories boasts QCD as one of its members (admittedly you 
should relax the massless nature of the quarks just a little bit.) 


At one-loop, the running of the gauge coupling can be read off from (2.83) 


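$$\frac{1}{g^2(\mu)} = \frac{1}{g_0^2} - \frac{1}{(4\pi)^2}\left[\frac{11N_c}{3} - \frac{2N_f}{3}\right]\log\left(\frac{\Lambda_{UV}^2}{\mu^2}\right) \qquad (2.84)$$


These theories exhibit different dynamics depending on the ratio $N_f/N_c$.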


The Infra-Red Free Phase 


Life is simplest when $N_f > 11N_c/2$. In this case, the contribution to the beta function
from the matter overwhelms the contribution from the gauge bosons, and the coupling
$g^2$ becomes weaker as we flow towards the infra-red. Such theories are said to be
infra-red free. This means that, for once, we can trust the classical description at low 
energies, where we have weakly coupled massless gauge bosons and fermions. 


The force between external, probe electric charges takes the form 


$$V_{\text{electric}}(r) \sim \frac{1}{r\log(r\Lambda_{UV})}$$


which is Coulombesque, but dressed with the extra log term which comes from the
running of the gauge coupling. This is the same kind of behaviour that we would get
in (massless) QED. Meanwhile, the potential between two external magnetic charges
takes the form


$$V_{\text{magnetic}}(r) \sim \frac{\log(r\Lambda_{UV})}{r}$$


The log in the numerator reflects the fact that magnetic charges experience a force
proportional to $1/g^2$ rather than $g^2$.


When $N_f = 11N_c/2$, the one-loop beta function vanishes. To see the fate of the
theory, we must turn to the two-loop beta function which we discuss below. It will 
turn out that the theory is again infra-red free. 


These theories are ill-defined in the UV, where there is a Landau pole. However, it’s 
quite possible that theories of these types arise as the low-energy limit of other theories. 


The Conformal Window 


Next, consider $N_f$ just below $11N_c/2$. To understand the behaviour of the theory, we
can look at the two-loop contribution to the beta function,


$$\beta(g) \equiv \mu\frac{dg}{d\mu} = \beta_0 g^3 + \beta_1 g^5 + \ldots$$


with the one-loop beta function extracted from (2.84)


$$\beta_0 = \frac{1}{(4\pi)^2}\left(-\frac{11N_c}{3} + \frac{2N_f}{3}\right)$$


We won’t compute the two-loop beta function here, but just state the result:


$$\beta_1 = \frac{1}{(16\pi^2)^2}\left(-\frac{34N_c^2}{3} + \frac{N_f(N_c^2-1)}{N_c} + \frac{10N_fN_c}{3}\right)$$


Note that $\beta_1 > 0$ as long as the number of flavours sits in the range $N_f > 34N_c^3/(13N_c^2 - 3)$.
But $\beta_0 < 0$ provided $N_f < 11N_c/2$ and so we can play the one-loop beta function
against the two-loop beta function, to find a non-trivial fixed point of the RG flow, at
which $\beta(g_\star) = 0$. This is given by


$$g_\star^2 = -\frac{\beta_0}{\beta_1}$$


Importantly, for $N_f/N_c = 11/2 - \epsilon$, with $\epsilon$ small, we have $g_\star^2 \ll 1$ and the analysis above
can be trusted. We learn that the low-energy physics is described by a weakly coupled 
field theory which, as a fixed point of RG, is invariant under scale transformations. 
This is known as the Banks-Zaks fixed point. There is a general expectation (although 
not yet a complete proof) that relativistic theories in d = 3+1 which are scale invariant 
are also invariant under a larger conformal symmetry. 
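

The competition between the one-loop and two-loop terms can be explored numerically. The following Python sketch evaluates the two coefficients quoted above and, when they have opposite signs, the fixed point value that follows from setting $\beta_0 g^3 + \beta_1 g^5 = 0$. The function names are our own, and the output should be trusted only where the resulting coupling is small.

```python
import math
from fractions import Fraction


def beta0_bracket(nc, nf):
    """(4 pi)^2 * beta_0 = -11 N_c / 3 + 2 N_f / 3."""
    return Fraction(-11 * nc, 3) + Fraction(2 * nf, 3)


def beta1_bracket(nc, nf):
    """(16 pi^2)^2 * beta_1 = -34 N_c^2 / 3 + N_f (N_c^2 - 1) / N_c + 10 N_f N_c / 3."""
    return (Fraction(-34 * nc**2, 3)
            + Fraction(nf * (nc**2 - 1), nc)
            + Fraction(10 * nf * nc, 3))


def fixed_point_coupling_sq(nc, nf):
    """Return g_*^2 = -beta_0 / beta_1, or None if beta_0 < 0 < beta_1 fails."""
    b0, b1 = beta0_bracket(nc, nf), beta1_bracket(nc, nf)
    if not (b0 < 0 < b1):
        return None
    # beta_0 / beta_1 = (b0 / 16 pi^2) / (b1 / (16 pi^2)^2) = 16 pi^2 b0 / b1
    return 16 * math.pi**2 * float(-b0 / b1)


if __name__ == "__main__":
    for nf in (8, 9, 12, 16, 17):
        print(nf, fixed_point_coupling_sq(3, nf))
```

For $N_c = 3$ the formal fixed point exists for $N_f = 9, \ldots, 16$, with $g_\star^2$ about 0.52 at $N_f = 16$ and growing rapidly as $N_f$ is lowered.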


At any such fixed point, the scale invariance is enough to ensure that both external
magnetic and electric probes experience a Coulomb force


$$V(r) \sim \frac{1}{r}$$


Such a phase could be described as a non-Abelian Coulomb phase, comprised of massless
gluons and fermions.


What happens if we now lower $N_f$ with fixed $N_c$? The formal result above says
that the fixed point remains (at least until $N_f \approx 34N_c^3/(13N_c^2 - 3)$) but the value of
the coupling $g_\star^2$ gets larger so that we can no longer trust the analysis. In general, we
expect there to be a conformal fixed point for


$$N_\star < N_f < \frac{11N_c}{2} \qquad (2.85)$$


for some critical value $N_\star$. This range of $N_f$ is referred to as the conformal window.
The obvious question is: what is the value of $N_\star$?


We don’t currently know the answer to this question. At the lower end of the 
conformal window, the theory is necessarily strongly coupled which makes it difficult 
to get a handle on the physics. There is evidence from numerical work that when 
$N_c = 3$ (which is the case for QCD) then the lower end of the conformal window sits
somewhere in the window $N_\star \in [8, 12]$, and probably closer to the middle than the
edges. One would also expect the conformal window to scale with $N_c$, so one could guess that
$N_\star \approx 3N_c$ to $4N_c$. There are various arguments that give values of $N_\star$ in this range,
but none of them are particularly trustworthy. 


We’ve seen that there is a set of conformal fixed points, labelled by $N_c$ and $N_f$ in
the range (2.85). We met such fixed points before in the course on Statistical Field 
Theory. In that context, we came across the powerful idea of universality: many 
different ultra-violet theories all flow to the same fixed point. This is responsible for 
the observation that all gases, regardless of their microscopic make-up, have exactly 
the same divergence in the heat capacity at their critical point. We could ask: is there 
a form of universality in gauge theories? In other words, can we write down two gauge 
theories which look very different in the ultra-violet, but nonetheless flow to the same 
infra-red fixed point? 


We don’t yet know of any examples of such universality in the QCD-like gauge theories
that we discuss in these lectures, although this is most likely due to our ignorance.
However, such examples are known in supersymmetric theories, which consist of gauge
fields, scalars and fermions interacting with specific couplings. In that context, it is
known that supersymmetric $SU(N_c)$ gauge theory coupled to $N_f$ fundamental flavours
flows to the same fixed point as $SU(N_f - N_c)$ gauge theory coupled to $N_f$ flavours.
(The latter flavours should also be coupled to a bunch of gauge neutral fields.) Furthermore,
the two descriptions can be identified as electric and magnetic variables for
the system. This phenomenon is known as Seiberg duality. However, it is a topic for a 
different course. 


Confinement and Chiral Symmetry Breaking 


What happens when $N_f < N_\star$ and we are no longer in the conformal window? The
expectation is that for $N_f < N_\star$ the coupling is once again strong enough to lead to
confinement, in the sense that all finite energy excitations are gauge singlets. 


Most of the degrees of freedom will become gapped, with a mass that is set parametrically
by $\Lambda_{QCD} = \mu\, e^{1/2\beta_0 g^2(\mu)}$. However, there do remain some massless modes. These
occur because of the formation of a vacuum condensate


$$\langle\bar\psi_i\psi_j\rangle \sim \delta_{ij} \qquad i,j = 1, \ldots, N_f$$


This spontaneously breaks the global symmetry of the model, known as the chiral 
symmetry. The result is once again a gapless phase, but now with the massless fields 
arising as Goldstone bosons. We will have a lot to say about this phase. We will say it 
in Section 5. 
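

The scale $\Lambda_{QCD}$ quoted above does not depend on the reference scale $\mu$ at which the coupling is specified, at least at one loop. The Python sketch below checks this numerically, reading the exponent as $1/(2\beta_0 g^2(\mu))$ and using the one-loop running $d(1/g^2)/d\log\mu = -2\beta_0$ that follows from $\beta(g) = \beta_0 g^3$. The function names and the sample numbers are our own and carry no physical significance.

```python
import math


def beta0(nc, nf):
    """One-loop coefficient beta_0 = (-11 N_c / 3 + 2 N_f / 3) / (4 pi)^2."""
    return (-11 * nc / 3 + 2 * nf / 3) / (4 * math.pi) ** 2


def lambda_qcd(mu, g_sq, nc, nf):
    """Lambda_QCD = mu * exp(1 / (2 beta_0 g^2(mu))), valid when beta_0 < 0."""
    b0 = beta0(nc, nf)
    if b0 >= 0:
        raise ValueError("no strong coupling scale in the infra-red: beta_0 >= 0")
    return mu * math.exp(1.0 / (2.0 * b0 * g_sq))


def run_coupling_sq(g_sq, mu_from, mu_to, nc, nf):
    """One-loop running: 1/g^2(mu_to) = 1/g^2(mu_from) - 2 beta_0 log(mu_to / mu_from)."""
    inverse = 1.0 / g_sq - 2.0 * beta0(nc, nf) * math.log(mu_to / mu_from)
    if inverse <= 0:
        raise ValueError("ran past the strong coupling scale")
    return 1.0 / inverse


if __name__ == "__main__":
    nc, nf = 3, 6
    g_sq_high = 1.5
    g_sq_low = run_coupling_sq(g_sq_high, 100.0, 10.0, nc, nf)
    # Both lines print the same scale, up to rounding.
    print(lambda_qcd(100.0, g_sq_high, nc, nf))
    print(lambda_qcd(10.0, g_sq_low, nc, nf))
```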


For pure Yang-Mills, we saw in Section 2.5 that a Wilson line, $W[C] = \text{tr}\,P\exp\left(i\oint A\right)$,
in the fundamental representation provides an order parameter for the confining phase,
with the area law, $\langle W[C]\rangle \sim e^{-\sigma A}$, the signature of confinement. However, in the presence
of dynamical, charged fundamental matter, whether fermions or scalars, this
criterion is no longer useful. The problem is that, for a sufficiently long flux tube, it
is energetically preferable to break the string by producing a particle-anti-particle pair
from the vacuum. If the flux tube has tension $\sigma$ and the particles have mass m, this
will occur when the length exceeds $L > 2m/\sigma$. For large loops, we therefore expect
$\langle W[C]\rangle \sim e^{-\mu L}$. This is the same behaviour that we previously argued for in the Higgs
phase. To see how they are related, we next turn to theories with scalars. 


2.7.3 The Higgs vs Confining Phase 


We now consider scalars. These can do something novel: they can condense and spontaneously
break the gauge symmetry. This is the Higgs phase.


Consider an $SU(N_c)$ gauge theory with $N_s$ scalar fields transforming in the fundamental
representation. If the scalars are massless, then the gauge coupling runs as


$$\frac{1}{g^2(\mu)} = \frac{1}{g_0^2} - \frac{1}{(4\pi)^2}\left[\frac{11N_c}{3} - \frac{N_s}{6}\right]\log\left(\frac{\Lambda_{UV}^2}{\mu^2}\right)$$


and, correspondingly, the coefficient of the one-loop beta function is


$$\beta_0 = \frac{1}{(4\pi)^2}\left(-\frac{11N_c}{3} + \frac{N_s}{6}\right)$$


For $N_s < 22N_c$, the coupling becomes strong at an infra-red scale, $\Lambda_{QCD} = \Lambda_{UV}\, e^{1/2\beta_0 g_0^2}$.
It is thought that the theory confines and develops a gap at this scale. We expect no 
massless excitations to survive. 


What now happens if we give a mass $m^2$ to the scalars? For $m^2 > 0$, we expect these
to shift the spectrum of the theory, but not qualitatively change the physics. Indeed,
for $m^2 \gg \Lambda_{QCD}^2$, we can essentially ignore the scalars at low energies, where we
revert to pure Yang-Mills. The real interest comes when we have $m^2 < 0$ so that the
scalars condense. What happens then?


Suppose that we take $m^2 \ll -\Lambda_{QCD}^2$. This means that the scalars condense at a
scale where the theory is still weakly coupled, $g^2(|m|) \ll 1$, and we can trust our
semi-classical analysis. If we have enough scalars to fully Higgs the gauge symmetry
($N_s \geq N_c - 1$ will do the trick), then all the gauge bosons and scalars again become
massive. 


It would seem that the Higgs mechanism and confinement are two rather different
ways to give a mass to the gauge bosons. In particular, the Higgs mechanism is
something that we can understand in a straightforward way at weak coupling while 
confinement is shrouded in strongly coupled mystery. Intuitively, we may feel that the 
Higgs phase is not the same as the confining phase. But are they really different? 


The sharp way to ask this question is: does the theory undergo a phase transition 
as we vary $m^2$ from positive to negative? We usually argue for the existence of a
phase transition by exhibiting an order parameter which has different behaviour in the 
two phases. For pure Yang-Mills, the signature for confinement is the area law for 
the Wilson loop. But, as we argued above, in the presence of dynamical fundamental 
matter the confining string can break, and the area law goes over to a perimeter law. 
But this is the expected behaviour in the Higgs phase. In the absence of an order 
parameter to distinguish between the confining and Higgs phases, it seems plausible 
that they are actually the same, and one can vary smoothly from one phase to another. 
To illustrate this, we turn to an example. 


An Example: SU(2) with Fundamental Matter


Consider SU(2) gauge theory with a single scalar $\phi$ in the fundamental representation.
For good measure, we’ll also throw in a single fermion $\psi$, also in the fundamental
representation. We take the action to be


$$S = \int d^4x\ -\frac{1}{2g^2}\text{tr}\,F^{\mu\nu}F_{\mu\nu} + |D_\mu\phi|^2 - \frac{\lambda}{4}\left(\phi^\dagger\phi - v^2\right)^2 + i\bar\psi\slashed{D}\psi + m\bar\psi\psi$$


Note that it’s not possible to build a gauge invariant Yukawa interaction with the
matter content available. We will look at how the spectrum changes as we vary
$v^2$ from positive to negative.


Higgs Phase, $v^2 > 0$: When $v^2 \gg \Lambda_{QCD}^2$ we can treat the action semi-classically. To 
read off the spectrum in the Higgs phase, it is simplest to work in unitary gauge in 
which the vacuum expectation value takes the form $\langle\phi\rangle = (v, 0)$. We can further use 
the gauge symmetry to focus on fluctuations of the form $\phi = (v + \tilde\phi, 0)$ with $\tilde\phi \in \mathbb{R}$. 
You can think of the other components of $\phi$ as being eaten by the Higgs mechanism to 
give mass to the gauge bosons. The upshot is that we have particles of spin 0, 1/2 and 
1, given by 


• A single, massive, real scalar $\tilde\phi$. 


• Two Dirac fermions $\psi_i = (\psi_1, \psi_2)$. Since the SU(2) gauge symmetry is broken, 
these should no longer be thought of as living in a doublet. As we vary the mass 
$m \in \mathbb{R}$, there is a point at which the fermions become massless. (Classically, this 
happens at $m = 0$ of course.) 


• Three massive spin 1 W-bosons $A^a_\mu$, with $a = 1, 2, 3$ labelling the generators of 
su(2). 


Confining Phase, $v^2 < 0$: When $v^2 < 0$, the scalar has mass $m^2 > 0$ and does not 
condense. Now we expect to be in the confining phase, in the sense that only gauge 
singlets have finite energy. We can list the simplest such states: we will see that they 
are in one-to-one correspondence with the spectrum in the Higgs phase. 


• A single, real scalar $\phi^\dagger\phi$. This is expected to be a massive excitation. If we were to 
evaluate this in the Higgs phase then, in unitary gauge, we have $\phi^\dagger\phi = v^2 + 2v\tilde\phi + \ldots$ 
and so the quadratic operator corresponds to the single particle excitation $\tilde\phi$, plus 
corrections. 


There are further scalar operators that we can construct, including ${\rm tr}\,F_{\mu\nu}F^{\mu\nu}$ and 
$\bar\psi\psi$. These have the same quantum numbers as $\phi^\dagger\phi$ and are expected to mix with 
it. In the confining phase, the lightest spin 0 excitation is presumably created by 
some combination of these. 


• Two Dirac fermions. The first is $\Psi_1 = \phi^\dagger\psi$. The second comes from using the $\epsilon^{ij}$ 
invariant tensor of SU(2), which allows us to build $\Psi_2 = \epsilon^{ij}\phi_i\psi_j$. If we expand 
these operators in unitary gauge in the Higgs phase, we have $\Psi_1 = v\psi_1 + \ldots$ and 
$\Psi_2 = v\psi_2 + \ldots$ 


It’s now less obvious that each of these fermions becomes massless for some value 
of $m \in \mathbb{R}$, but it remains plausible. Indeed, one can show that this does occur. (A 
modern perspective is that the fermionic excitation is in a different topological 
phase for $m \gg 0$ and $m \ll 0$, ensuring a gapless mode as we vary the mass 
between the two.) 


• Finally, we come to the spectrum of spin 1 excitations. Since we want these to 
be associated to gauge fields, we might be tempted to consider gauge invariant 
operators such as ${\rm tr}\,F^{\mu\nu}F_{\mu\nu}$, but this corresponds to a scalar glueball. Instead, we 
can construct three gauge invariant, spin 1 operators. We have the real operator 
$i\phi^\dagger D_\mu\phi$, and the complex operator $\epsilon^{ij}\phi_i(D_\mu\phi_j)$. In unitary gauge, these become 
proportional to $v^2A^3_\mu$ and $v^2(A^1_\mu + iA^2_\mu)$ respectively. 
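
As a quick sanity check of this dictionary, here is a minimal numerical sketch. It assumes the conventions $D_\mu\phi = \partial_\mu\phi - iA_\mu\phi$ with $A_\mu = A^a_\mu\sigma^a/2$ and $\epsilon^{12} = +1$, which are choices made for this illustration; the function names are hypothetical. It evaluates the gauge invariant fermion and vector operators on the constant unitary gauge background $\phi = (v, 0)$. 

```python
# Pauli matrices as nested tuples of (complex) numbers
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def gauge_matrix(A):
    """Return A_mu = A^a sigma^a / 2 for real components A = (A1, A2, A3)."""
    return [[sum(A[a] * SIGMA[a][i][j] for a in range(3)) / 2
             for j in range(2)] for i in range(2)]


def mat_vec(M, x):
    """Multiply a 2x2 matrix by a 2-component column vector."""
    return [M[i][0] * x[0] + M[i][1] * x[1] for i in range(2)]


def dagger_dot(x, y):
    """The SU(2) invariant x^dagger y."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


def eps_dot(x, y):
    """The SU(2) invariant eps^{ij} x_i y_j, assuming eps^{12} = +1."""
    return x[0] * y[1] - x[1] * y[0]


def unitary_gauge_operators(v, psi, A):
    """Evaluate the gauge invariant operators on the background phi = (v, 0)."""
    phi = [complex(v), 0j]
    # For constant phi the covariant derivative reduces to -i A_mu phi
    # (assumed convention: D = d - iA).
    D_phi = [-1j * c for c in mat_vec(gauge_matrix(A), phi)]
    return {
        "Psi1": dagger_dot(phi, psi),                 # phi^dagger psi
        "Psi2": eps_dot(phi, psi),                    # eps^{ij} phi_i psi_j
        "real_vector": 1j * dagger_dot(phi, D_phi),   # i phi^dagger D phi
        "complex_vector": eps_dot(phi, D_phi),        # eps^{ij} phi_i (D phi)_j
    }
```

With these assumptions the two fermion operators come out as $v\psi_1$ and $v\psi_2$, the real vector operator as $v^2A^3_\mu/2$ and the complex one as $-\frac{i}{2}v^2(A^1_\mu + iA^2_\mu)$. The overall constants depend on the conventions, but the field content matches the Higgs phase list. 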


This is a strongly coupled theory, so there may well be a slew of further bound states 
and these presumably differ between the Higgs and confining phases. Nonetheless, the 
matching of the spectrum suggests that we can smoothly continue from one phase to 
the other without any discontinuity. We conclude that, for this example, the Higgs and 
confining phases are actually the same phase. 


Another Example: SU(2) with an Adjoint Scalar 


It’s worth comparing what happened above with a slightly different theory in which we 
can distinguish between the two phases. We’ll again take SU(2), but this time with 
an adjoint scalar field $\phi$. We’ll also throw in a fermion $\psi$, but we’ll keep this in the 
fundamental representation. The action is now 

$$S = \int d^4x\ -\frac{1}{2g^2}{\rm tr}\,F^{\mu\nu}F_{\mu\nu} + \frac{1}{g^2}{\rm tr}\,(D_\mu\phi)^2 - \frac{\lambda}{4}\left({\rm tr}\,\phi^2 - \frac{v^2}{2}\right)^2 + i\bar\psi\,\gamma^\mu D_\mu\psi + \lambda'\bar\psi\phi\psi + m\bar\psi\psi$$

where we’ve now also included a Yukawa coupling between the scalar and fermion. 


Once again, we can look at whether there is a phase transition as we vary $v^2$. For 
$v^2 < 0$, the scalar field is massive and we expect the theory to be gapped and confine. 
Importantly, in this phase the spectrum contains only bosonic excitations. There are 
no fermions because it’s not possible to construct a gauge invariant fermionic operator. 


In contrast, when $v^2 > 0$ the scalar field will get an expectation value, breaking the 
gauge group SU(2) → U(1), resulting in a gapless photon. There are also now two 
fermionic excitations which carry charge $\pm\frac{1}{2}$. The spectrum now looks very different 
from the confining phase. 


Clearly in this case the Higgs and confining phases are different. Yet, because we 
have fermions in the fundamental representation, we will still have dynamical breaking 
of the flux tube and so the fundamental Wilson loop W[C] does not provide an order 
parameter for confinement. Nonetheless, the existence of finite energy states which 
transform under the $\mathbb{Z}_2$ centre of SU(2), which here coincides with $(-1)^F$ with F the 
fermion number, provides a diagnostic for the phase. 


2.8 ’t Hooft-Polyakov Monopoles 


Coupling dynamical, electrically charged particles to Yang-Mills theory is straightfor- 
ward, although understanding their dynamics may not be. But what about dynamical 
magnetically charged particles? 


For Abelian gauge theories, this isn’t possible: if you want to include Dirac monopoles 
in your theory then you have to put them in by hand. But for non-Abelian gauge 
theories, it is a wonderful and remarkable fact that, with the right matter content, 
magnetic monopoles come along for free: they are solitons in the theory. 


Magnetic monopoles appear whenever we have a non-Abelian gauge theory, broken 
to its Cartan subalgebra by an adjoint Higgs field. The simplest example is SU(2) 
gauge theory coupled to a single adjoint scalar $\phi$. As explained previously, we use 
the convention in which $\phi$ sits in the Lie algebra, so $\phi = \phi^aT^a$. For G = SU(2) the 
generators are $T^a = \sigma^a/2$, with $\sigma^a$ the Pauli matrices. We take the action to be 

$$S = \int d^4x\ -\frac{1}{2g^2}{\rm tr}\,F_{\mu\nu}F^{\mu\nu} + \frac{1}{g^2}{\rm tr}\,(D_\mu\phi)^2 - \frac{\lambda}{4}\left({\rm tr}\,\phi^2 - \frac{v^2}{2}\right)^2 \tag{2.86}$$


Note that we’ve rescaled the scalar $\phi$ so that it too has a $1/g^2$ sitting in front of it. 


The potential is non-negative. The vacuum of the theory has constant expectation 
value $\langle\phi\rangle$. Up to a gauge transformation, we can take 

$$\langle\phi\rangle = \frac{1}{2}\begin{pmatrix} v & 0 \\ 0 & -v \end{pmatrix} \tag{2.87}$$

This breaks the gauge group SU(2) → U(1). The spectrum consists of a massless 
photon (which, in this gauge, sits in the $T^3$ part of the gauge group) together with 
massive W-bosons and a massive scalar. 

There are, however, more interesting possibilities for the expectation value. Any finite 
energy excitation must approach a configuration with vanishing potential at spatial 
infinity. Such configurations obey ${\rm tr}\,\phi^2 \to v^2/2$ as $|x| \to \infty$. Decomposing the Higgs field 
into the generators of the Lie algebra, $\phi = \phi^aT^a$, $a = 1, 2, 3$, the requirement that the 
potential vanishes defines a sphere in field space, 

$$S^2 := \left\{\phi\ :\ \phi^a\phi^a = v^2\right\} \tag{2.88}$$


We see that for any finite energy configuration, we must specify a map which tells us 
the behaviour of the Higgs field asymptotically, 


$$\phi:\ S^2_\infty \to S^2$$


The fact that these maps fall into disjoint classes should no longer be a surprise: it’s 
the same idea that we met in Sections 2.2 and 2.3 when discussing theta vacua and 
instantons, and again in Section 2.5.2 when discussing vortices. This time the relevant 
homotopy group is 


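$$\Pi_2(S^2) = \mathbb{Z}$$

Given a configuration $\phi$, the winding number is computed by 

$$\nu = \frac{1}{8\pi v^3}\int_{S^2_\infty} d^2S_i\ \epsilon^{ijk}\epsilon_{abc}\,\phi^a\partial_j\phi^b\partial_k\phi^c \in \mathbb{Z} \tag{2.89}$$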
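
To get a feel for the winding number, here is a small numerical sketch. It assumes that the asymptotic field is written as $\phi^a = v\,n^a$ with $n^a$ a unit vector, in which case the integral above reduces to the familiar degree formula $\frac{1}{4\pi}\int d\theta\,d\varphi\ n\cdot(\partial_\theta n\times\partial_\varphi n)$ over the angles of the asymptotic sphere. The family of maps `azimuthal_map(k)`, the grid size and the finite-difference step are illustrative choices rather than anything fixed by the text. 

```python
import math


def azimuthal_map(k):
    """Unit-vector map n(theta, phi) that winds k times around the z-axis."""
    def n(theta, phi):
        return (math.sin(theta) * math.cos(k * phi),
                math.sin(theta) * math.sin(k * phi),
                math.cos(theta))
    return n


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def winding_number(n, steps=120, h=1e-5):
    """Approximate (1/4pi) * integral of n . (d_theta n x d_phi n).

    Uses the midpoint rule on a steps x steps grid in (theta, phi) and
    central finite differences with step h for the derivatives.
    """
    d_theta = math.pi / steps
    d_phi = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        for j in range(steps):
            phi = (j + 0.5) * d_phi
            dn_dtheta = [(p - m) / (2 * h)
                         for p, m in zip(n(theta + h, phi), n(theta - h, phi))]
            dn_dphi = [(p - m) / (2 * h)
                       for p, m in zip(n(theta, phi + h), n(theta, phi - h))]
            total += dot(n(theta, phi), cross(dn_dtheta, dn_dphi)) * d_theta * d_phi
    return total / (4.0 * math.pi)
```

For example, `winding_number(azimuthal_map(1))` evaluates the hedgehog map $n = \hat{x}$, while `azimuthal_map(2)` wraps the target sphere twice. The result is only approximately an integer because of the discretisation, so in practice one would round it. 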


In a sector with $\nu \neq 0$, the gauge symmetry breaking remains SU(2) → U(1). The 
difference is that now the unbroken U(1) ⊂ SU(2) changes as we move around the 
asymptotic $S^2_\infty$. 


The next step is to notice that if the Higgs field has winding $\nu \neq 0$, then we must also 
turn on a compensating gauge field. The argument is the same as the one we saw for 
vortex strings. Suppose that we try to set $A_i = 0$. Then, the covariant derivatives are 
simply ordinary derivatives and, asymptotically, we have $(D_i\phi)^2 = (\partial_i\phi)^2 \sim (\partial_\theta\phi)^2/r^2$, 
with $\partial_\theta$ denoting the (necessarily non-vanishing) variation as we move around the 
angular directions of the asymptotic $S^2_\infty$. The energy of the configuration will then include 
the term 

$$E = \frac{1}{g^2}\int d^3x\ {\rm tr}\,(\partial_i\phi)^2 \sim \frac{1}{g^2}\int_{S^2_\infty} d^2\Omega \int dr\ r^2\ \frac{{\rm tr}\,(\partial_\theta\phi)^2}{r^2}$$

This integral diverges linearly. We learn that if we genuinely want a finite energy 
excitation in which the Higgs field winds asymptotically then we must also turn on the 
gauge fields $A_i$ to cancel the $1/r$ asymptotic fall-off of the angular gradient terms, and 
ensure that $D_\theta\phi \to 0$ as $r \to \infty$. We want to solve 

$$D_i\phi = \partial_i\phi - i[A_i, \phi] \to 0 \quad\Rightarrow\quad A_i \to \frac{i}{v^2}[\phi, \partial_i\phi] + \frac{a_i}{v}\phi$$


Here the first term works to cancel the fall-off from $\partial_i\phi$. To see this, you will need to 
use the fact that ${\rm tr}\,\phi^2 \to v^2/2$, and so ${\rm tr}(\phi\,\partial_i\phi) \to 0$, as well as the su(2) commutation 
relations. The second term in $A_i$ does not contribute to the covariant derivative $D_i\phi$. 
The function $a_i$ is the surviving, massless U(1) photon which can be written in a gauge 
invariant way as 

$$a_\mu = \frac{2}{v}{\rm tr}\,(\phi A_\mu) \tag{2.90}$$
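
The cancellation can be checked numerically. The sketch below assumes the asymptotic form $A_i = \frac{i}{v^2}[\phi, \partial_i\phi] + \frac{a_i}{v}\phi$, the convention $D_i\phi = \partial_i\phi - i[A_i, \phi]$ with $T^a = \sigma^a/2$, and takes the Higgs field to point in the hedgehog direction $\phi = v\,\hat{x}^aT^a$. The helper names and the sample values are illustrative only. 

```python
import math

# Pauli matrices as nested tuples of (complex) numbers
SIGMA = (((0, 1), (1, 0)), ((0, -1j), (1j, 0)), ((1, 0), (0, -1)))


def lie_algebra_element(c):
    """Return c^a T^a with T^a = sigma^a / 2."""
    return [[sum(c[a] * SIGMA[a][i][j] for a in range(3)) / 2
             for j in range(2)] for i in range(2)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]


def combine(a, X, b, Y):
    """Return the linear combination a*X + b*Y of two 2x2 matrices."""
    return [[a * X[i][j] + b * Y[i][j] for j in range(2)] for i in range(2)]


def max_covariant_derivative(x, v, a, gauge_field_on=True):
    """Largest matrix entry of D_i phi over i = 1, 2, 3 at the point x."""
    r = math.sqrt(sum(c * c for c in x))
    n = [c / r for c in x]
    phi = lie_algebra_element([v * c for c in n])
    worst = 0.0
    for i in range(3):
        # d_i n^b = (delta_ib - n_i n_b) / r for the hedgehog direction
        d_n = [((1.0 if i == b else 0.0) - n[i] * n[b]) / r for b in range(3)]
        d_phi = lie_algebra_element([v * c for c in d_n])
        if gauge_field_on:
            A_i = combine(1j / v**2, commutator(phi, d_phi), a[i] / v, phi)
        else:
            A_i = [[0j, 0j], [0j, 0j]]
        D_phi = combine(1.0, d_phi, -1j, commutator(A_i, phi))
        worst = max(worst, max(abs(e) for row in D_phi for e in row))
    return worst
```

With the gauge field switched on, the covariant derivative vanishes to machine precision for any choice of the photon components `a`, which confirms that the second term drops out of $D_i\phi$. With `gauge_field_on=False` the gradient survives and falls off only as $1/r$. 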


We can also compute the asymptotic form of the field strength. The same kinds of 
manipulations above show that this lies in the same direction in the Lie algebra as $\phi$, 

$$F_{ij} = \frac{1}{v}\mathcal{F}_{ij}\,\phi \quad {\rm with} \quad \mathcal{F}_{ij} = f_{ij} + \frac{2i}{v^3}{\rm tr}\left(\phi\,[\partial_i\phi, \partial_j\phi]\right)$$

Here $f_{ij} = \partial_ia_j - \partial_ja_i$ is the Abelian field strength that we may have naively expected. 
But we see that there is an extra term, and this brings a happy surprise, since it 
contributes to the magnetic charge $m$ of the U(1) field strength. This is given by 

$$m = -\int d^2S_i\ \frac{1}{2}\epsilon^{ijk}\mathcal{F}_{jk} = \frac{1}{2v^3}\int d^2S_i\ \epsilon^{ijk}\epsilon_{abc}\,\phi^a\partial_j\phi^b\partial_k\phi^c = 4\pi\nu \tag{2.91}$$

with $\nu$ the winding number defined in (2.89). We learn that any finite energy 
configuration in which the Higgs field winds asymptotically necessarily carries a magnetic 
charge under the unbroken U(1) ⊂ SU(2). This object is a soliton and goes by the 
name of the ’t Hooft-Polyakov monopole. 


The topological considerations above have led us to a quantised magnetic charge. 
However, at first glance, the single ’t Hooft-Polyakov monopole with $\nu = 1$ seems 
to have twice the charge required by Dirac quantisation (1.3), since the W-bosons 
have electric charge $q = 1$. But there is nothing to stop us including matter in the 
fundamental representation of SU(2) with $q = \pm\frac{1}{2}$, with respect to which the 
’t Hooft-Polyakov monopole has the minimum allowed charge. 
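
To make the counting explicit, here is the arithmetic, assuming that the Dirac condition (1.3) takes the form $qm \in 2\pi\mathbb{Z}$. The precise form of (1.3) is not reproduced in this section, so this normalisation should be read as an assumption. 

```latex
\begin{align*}
\nu = 1: \quad & m = 4\pi\nu = 4\pi \\
\text{W-boson, } q = 1: \quad & qm = 4\pi = 2 \times 2\pi \\
\text{fundamental matter, } q = \tfrac{1}{2}: \quad & qm = 2\pi
\end{align*}
```

With only adjoint matter the monopole carries twice the minimal Dirac charge. Once matter with charge 1/2 is allowed, it carries exactly the minimum. 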




2.8.1 Monopole Solutions 


We have not yet solved the Yang-Mills-Higgs equations of motion with a given magnetic 
charge. In general, no static solutions are expected to exist with winding $\nu > 1$, because 
magnetically charged objects typically repel each other. For this reason, we restrict 
attention to the configurations with winding $\nu = \pm 1$. 


We can write an ansatz for a scalar field with winding $\nu = 1$, 

$$\phi^a = \frac{x^a}{r^2}h(r) \quad {\rm with} \quad h(r) \to \begin{cases} 0 & r \to 0 \\ vr & r \to \infty \end{cases}$$

This is the so-called “hedgehog” ansatz, since the direction of the scalar field $\phi = \phi^aT^a$ 
is correlated with the direction $x^a$ in space. Just like a hedgehog. In particular, this 
means that the SU(2) gauge action on $\phi^a$ and the SO(3) rotational symmetry on $x^a$ are 
locked, so that only the diagonal combination is preserved by such configurations. We 
can make a corresponding ansatz for the gauge field which preserves the same diagonal 
SO(3), 

$$A^a_i = -\epsilon_{aij}\frac{x^j}{r^2}\left[1 - k(r)\right] \quad {\rm with} \quad k(r) \to \begin{cases} 1 & r \to 0 \\ 0 & r \to \infty \end{cases}$$

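We can now insert this ansatz into the equations of motion 

$$D_\mu F^{\mu\nu} - i[\phi, D^\nu\phi] = 0 \quad {\rm and} \quad D^2\phi = -\frac{g^2\lambda}{2}\left({\rm tr}\,\phi^2 - \frac{v^2}{2}\right)\phi \tag{2.92}$$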


This results in coupled, ordinary differential equations for h(r) and k(r). In general, 
they cannot be solved analytically, but it is not difficult to find numerical solutions for 
the minimal ’t Hooft-Polyakov monopole. 


BPS Monopoles 


Something special happens when we set $\lambda = 0$ in (2.86). Here the scalar potential 
vanishes which means that, at least classically, we can pick any expectation value $v$ 
for the scalar. The choice of $v$ should be thought of as extra information needed to 
define the vacuum of the theory. (In the quantum theory, one typically expects to 
generate a potential for $\phi$. The exception to this is in supersymmetric theories, where 
cancellations ensure that the quantum potential also vanishes. Indeed, the monopoles 
that we describe below have a nice interplay with supersymmetry, although this is 
beyond the scope of these lectures.) 


When the potential vanishes, it is possible to use the Bogomolnyi trick to rewrite 
the energy functional. In terms of the non-Abelian magnetic field $B_i = -\frac{1}{2}\epsilon_{ijk}F_{jk}$, the 
energy of a static configuration with vanishing electric field is 

$$\begin{aligned}
E &= \frac{1}{g^2}\int d^3x\ {\rm tr}\left(B_i^2 + (D_i\phi)^2\right) \\
  &= \frac{1}{g^2}\int d^3x\ {\rm tr}\,(B_i \mp D_i\phi)^2 \pm \frac{2}{g^2}\int d^3x\ {\rm tr}\,B_iD_i\phi \\
  &\geq \pm\frac{2}{g^2}\int d^3x\ \partial_i\,{\rm tr}\,(B_i\phi)
\end{aligned}$$


where, to get to the last line, we have discarded the positive definite term and integrated 
by parts, invoking the Bianchi identity $D_iB_i = 0$. We recognise the final expression 
as the magnetic charge. We find that the energy of a configuration is bounded by the 
magnetic charge 

$$E \geq \frac{2v|m|}{g^2} \tag{2.93}$$

A configuration which saturates this bound is guaranteed to solve the full equations of 
motion. This is achieved if we solve the first order Bogomolnyi equations 

$$B_i = \pm D_i\phi \tag{2.94}$$

with the $\pm$ sign corresponding to monopoles (with $m > 0$) and anti-monopoles (with 
$m < 0$) respectively. It can be checked that solutions to (2.94) do indeed solve the full 
equations of motion (2.92) when $\lambda = 0$. 


Solutions to (2.94) have a number of interesting properties. First, it turns out that 
the equations of motion for a single monopole have a simple analytic solution, 

$$h(r) = vr\coth(vr) - 1 \quad {\rm and} \quad k(r) = \frac{vr}{\sinh(vr)}$$

This was first discovered by Prasad and Sommerfield. In general, solutions to (2.94) 
are referred to as BPS monopoles, with Bogomolnyi’s name added as well. 
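
These profile functions are easy to test numerically. The sketch below assumes that, on the hedgehog ansatz, the Bogomolnyi equations reduce to the standard Prasad-Sommerfield radial form $r\,k' = -kh$ and $r\,h' = h + 1 - k^2$. These reduced equations are not written out in the text, so they should be read as an assumption, and the function names are illustrative. 

```python
import math


def h_bps(r, v):
    """Higgs profile h(r) = v r coth(v r) - 1."""
    return v * r / math.tanh(v * r) - 1.0


def k_bps(r, v):
    """Gauge profile k(r) = v r / sinh(v r)."""
    return v * r / math.sinh(v * r)


def radial_residuals(r, v, eps=1e-5):
    """Residuals of the assumed radial equations at radius r.

    Returns (r k' + k h, r h' - (h + 1 - k^2)), with the derivatives
    estimated by central finite differences of step eps.
    """
    dh = (h_bps(r + eps, v) - h_bps(r - eps, v)) / (2.0 * eps)
    dk = (k_bps(r + eps, v) - k_bps(r - eps, v)) / (2.0 * eps)
    h, k = h_bps(r, v), k_bps(r, v)
    return (r * dk + k * h, r * dh - (h + 1.0 - k * k))
```

Both residuals vanish up to finite-difference error at any radius. The same functions can be used to confirm the boundary behaviour of the ansatz: $h \to 0$ and $k \to 1$ as $r \to 0$, while $h/vr \to 1$ and $k \to 0$ at large $r$. 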


A warning on terminology: these BPS monopoles have rather special properties 
in the context of supersymmetric theories where they live in short multiplets of the 
supersymmetry algebra. The term “BPS” has since been co-opted and these days is 
much more likely to refer to some kind of protected object in supersymmetry, often one 
that has nothing to do with the monopole. 

The Bogomolnyi equations (2.94) also have solutions corresponding to monopoles 
with higher magnetic charges. These solutions include configurations that look like far 
separated single charge monopoles. This is mildly surprising. Our earlier intuition told 
us that such solutions should not exist because the repulsive force between magnetically 
charged particles would ensure that the energy could be lowered by moving them further 
apart. That intuition breaks down in the Bogomolnyi limit because we have a new 
massless particle, the scalar $\phi$, and this gives rise to a compensating attractive 
force between monopoles, one which precisely cancels the magnetic repulsion. You can 
learn much more about the properties of these solutions, and the role they play in 
supersymmetric theories, in the lectures on Solitons. 


Monopoles in Other Gauge Groups 


It is fairly straightforward to extend the discussion above to other gauge groups G. 
We again couple a scalar field $\phi$ in the adjoint representation and give it an expectation 
value that breaks G → H where $H = U(1)^r$, with $r$ the rank of the gauge group. 


Given an expectation value for $\phi$, we can always rotate it by acting with G. However, 
by definition, H leaves the scalar untouched which means that configurations are 
now classified by maps from $S^2_\infty$ into the space G/H. (In our previous discussion we 
had $G/H = SU(2)/U(1) = S^2$ which coincides with what we found in (2.88).) A result 
in homotopy theory tells us that, for simply connected G, 

$$\Pi_2(G/H) = \Pi_1(H) = \mathbb{Z}^r$$

We learn that the ’t Hooft-Polyakov monopoles are labelled by an r-dimensional 
magnetic charge vector m. This agrees with our analysis of ’t Hooft lines in Section 
2.6. A closer look reveals that the ’t Hooft-Polyakov monopoles have magnetic charge 
$m \in 2\pi\Lambda_{\rm co\text{-}root}(\mathfrak{g})$, as required by the Goddard-Nuyts-Olive quantisation (2.80). 


2.8.2 The Witten Effect Again 


We saw in Section 1.2.3 that, in the presence of a $\theta$ term, a Dirac monopole picks up 
an electric charge. As we now show, this phenomenon, known as the Witten effect, also 
occurs for the ’t Hooft-Polyakov monopole. 


To see this, we simply need to be careful in identifying the electric charge operator 
in the presence of a monopole. We saw in (2.90) that the unbroken U(1) ⊂ SU(2) is 
determined by $\phi$. The corresponding global gauge transformation is 

$$\delta A_\mu = \frac{1}{v}D_\mu\phi$$


But we already did the hard work and computed the Noether charge Q associated to 
such a gauge transformation in (2.30), where we saw that it picks up a contribution 
from the $\theta$ term (2.22); we have 

$$Q = \frac{1}{g^2}\int d^3x\ {\rm tr}\left[\left(E_i + \frac{\theta g^2}{8\pi^2}B_i\right)\frac{1}{v}D_i\phi\right]$$

In our earlier discussion, around equation (2.30), we were working in the vacuum and 
could discard the contribution from $\theta$. However, in the presence of a monopole both 
terms contribute. The total electric charge Q is now 

$$Q = q + \frac{\theta g^2m}{8\pi^2} \tag{2.95}$$

with the naive electric charge $q$ defined as 

$$q = \frac{1}{g^2v}\int d^3x\ {\rm tr}\,D_i\phi\,E_i$$

and the magnetic charge $m$ defined, as in (2.91), by 

$$m = \frac{1}{g^2v}\int d^3x\ {\rm tr}\,D_i\phi\,B_i$$

We see that the theta term does indeed turn the monopole into a dyon. This agrees 
with our previous discussion of the Witten effect (1.19), with the seemingly different 
factor of 2 arising because, as explained above, $q$ is quantised in units of 1/2 in the 
non-Abelian gauge theory. 


2.9 Further Reading 


Trinity College, Cambridge boasts many great scientific achievements. The discovery 
of Yang-Mills theory is not among the most celebrated. Nonetheless, in January 1954 
a graduate student at Trinity named Ronald Shaw wrote down what we now refer to 
as the Yang-Mills equations. Aware that the theory describes massless particles, which 
appear to have no place in Nature, Shaw was convinced by his supervisor, Abdus Salam, 
that the result was not worth publishing. It appears only as a chapter of his thesis 
[181]. 


Across the Atlantic, at Brookhaven National Laboratory, two office mates did not 
make the same mistake. C. N. Yang and Robert Mills constructed the equations which 
now bear their name [232]. It seems likely that they got the result slightly before 
Shaw, although the paper only appeared afterwards. Their original motivation now 
seems somewhat misguided: their paper suggests that global symmetries of quantum 
field theory, specifically SU(2) isospin, are not consistent with locality. They write 


“It seems that this [global symmetry] is not consistent with the localized 
field concept that underlies the usual physical theories” 


From this slightly shaky start, one of the great discoveries of 20th century physics 
emerged. 


In those early days, the role played by Yang-Mills theory was, to say the least, 
confusing. Yang gave a famous seminar in Princeton in which Pauli complained so 
vociferously about the existence of massless particles that Yang refused to go on with 
the talk and had to be coaxed back to the blackboard by Oppenheimer. (Pauli had 
a headstart here: in 1953 he did a Kaluza-Klein reduction on $S^2$, realising an SU(2) 
gauge theory but discarding it because of the massless particle [151]. A similar result 
had been obtained earlier by Klein [122].) 


It took a decade to realise that the gauge bosons could get a mass from the Higgs 
mechanism, and a further decade to realise that the massless particles were never really 
there anyway: they are an artefact of the classical theory and gain a mass automatically 
when $\hbar \neq 0$. Below is a broad brush description of this history. A collection of 
reminiscences, “50 Years of Yang-Mills” [108], contains articles by a number of the 
major characters in this story. 


Asymptotic Freedom 


As the 1970s began, quantum field theory was not in fashion. Fundamental laws of 
physics, written in the language of field theory, languished in the literature, unloved 
and uncited [77, 205]. The cool kids were playing with bootstraps. 


The discovery of asymptotic freedom was one of the first results that brought field 
theory firmly into the mainstream. The discovery has its origins in the deep inelastic 
scattering experiments performed in SLAC in the late 1960s. Bjorken [19] and subse- 
quently Feynman [56] realised that the experiments could be interpreted in terms of 
the momentum distribution of constituents of the proton. But this interpretation held 
only if the interactions between these constituents became increasingly weak at high 
energies. Feynman referred to the constituents as “partons” rather than “quarks” [57]. 
It is unclear whether this was because he wanted to allow for the possibility of other 
constituents, say gluons, or simply because he wanted to antagonise Gell-Mann. 


In Princeton, David Gross set out to show that no field theory could exhibit 
asymptotic freedom [86]. Having ruled out field theories based on scalars and fermions, all 
that was left was Yang-Mills. He attacked this problem with his new graduate student 
Frank Wilczek. The minus signs took some getting right, but by April 1973 they 
realised that they had an asymptotically free theory on their hands [83] and were keenly 
aware of its importance. 


Meanwhile, in Harvard, Sidney Coleman was interested in the same problem. He 
asked his graduate student Erick Weinberg to do the calculation but, content that 
he had enough for his thesis, Erick passed it on to another graduate student, David 
Politzer. Politzer finished his calculation at the same time as the Princeton team [156]. 
In 2004, Gross, Politzer and Wilczek were awarded the Nobel prize. Politzer’s Nobel 
lecture contains an interesting, and very human, account of the discovery [157]. 


In fact, both American teams had been scooped. In June 1972, at a conference in 
Marseilles, a Dutch graduate student named Gerard ’t Hooft sat in a talk by Symanzik 
on the SLAC experiments and their relation to asymptotic freedom. After the talk, ’t 
Hooft announced that Yang-Mills theory is asymptotically free. Symanzik encouraged 
him to publish this immediately but, like Shaw 20 years earlier, ’t Hooft decided against 
it. His concern was that Yang-Mills theory could not be relevant for the strong force 
because it had no mechanism for the confinement of quarks [107]. 


The failure to publish did not hurt ’t Hooft’s career. By that stage he had already 
shown that Yang-Mills was renormalisable, a fact which played a large role in bringing 
the theory out of obscurity [93, 94, 95]. This was enough for him to be awarded his 
PhD [96]. It was also enough for him to be awarded the 1999 Nobel prize, together 
with his advisor Veltman. We will be seeing much more of the work of ’t Hooft later 
in these lectures. 


The analogy between asymptotic freedom and paramagnetism was made by N. K. 
Nielsen [148], although the author gives private credit to ’t Hooft. In these lectures, we 
computed the one-loop beta function using the background field method. This method 
was apparently introduced by (of course) ’t Hooft in lectures which I haven’t managed 
to get hold of. It first appears in published form in a paper by Larry Abbott [1] (now 
a prominent theoretical neuroscientist) and is covered in the textbook by Peskin and 
Schroeder [154]. 


Confinement and the Mass Gap 


Asymptotic freedom gave a dynamical reason to believe that Yang-Mills was likely 
responsible for the strong force. Earlier arguments that quarks should have three 
colour degrees of freedom meant that attention quickly focussed on the gauge group 
SU(3) [84, 65]. But the infra-red puzzles still remained. Why are the massless particles 
predicted by Yang-Mills not seen? Why are individual quarks not seen? 


Here things were murky. Was the SU(3) gauge group broken by a scalar field? Or 
was it broken by some internal dynamics? Or perhaps the gauge group was actually 
unbroken but the flow to strong coupling does something strange. This latter possibility 
was mooted in a number of papers [84, 207, 208, 65]. This from Gross and Wilczek in 
1973, 


“Another possibility is that the gauge symmetry is exact. At first sight 
this would appear ridiculous since it would imply the existence of massless, 
strongly coupled vector mesons. However, in asymptotically free theories 
these naive expectations might be wrong. There may be little connection 
between the “free” Lagrangian and the spectrum of states.” 


This idea was slowly adopted over the subsequent year. The idea of dimensional 
transmutation, in which dimensionless constants combine with the cut-off to give 
a physical scale, was known from the 1973 work of Coleman and E. Weinberg [27]. 
Although they didn’t work with Yang-Mills, their general mechanism removed the most 
obvious hurdle for a scale-invariant theory to develop a gap. A number of dynamical 
explanations were mooted for confinement, but the clearest came only in 1974 with 
Wilson’s development of lattice gauge theory [214]. This paper also introduced what 
we now call the Wilson line. We will discuss the lattice approach to confinement in 
some detail in Section 4. 


The flurry of excitement surrounding these developments also serves to highlight the 
underlying confusion, as some of the great scientists of the 20th century clamoured 
to disown their best work. For example, in an immediate response to the discovery 
of asymptotic freedom, and six years after his construction of the electroweak theory 
[205], Steven Weinberg writes [208] 


“Of course, these very general results will become really interesting only 
when we have some specific gauge model of the weak and electromagnetic 
interactions which can be taken seriously as a possible description of the 
real world. This we do not yet have.” 


Not to be outdone, in the same year Gell-Mann offers [65] 
“We do not accept theories in which quarks are real, observable particles.” 


It’s not easy doing physics. 


Semi-Classical Yang-Mills 


In these lectures, we first described the classical and semi-classical structure of Yang- 
Mills theory, and only then turned to the quantum behaviour. This is the logical way 
through the subject. It is not the historical way. 


Our understanding of the classical vacuum structure of Yang-Mills theory started 
in 1975, when Belavin, Polyakov, Schwartz and Tyupkin discovered the Yang-Mills 
instanton [14]. Back then, Physical Review refused to entertain the name “instanton”, 
so they were referred to in print as “pseudoparticles”. 


’t Hooft was the first to perform detailed instanton calculations [101, 102], including 
the measure $K(\rho)$ that we swept under the carpet in Section 2.3.3. Among other 
things, his work clearly showed that physical observables depend on the theta angle. 
Motivated by this result, Jackiw and Rebbi [113], and independently Callan, Dashen 
and Gross [23], understood the semi-classical vacuum structure of Yang-Mills that we 
saw in Section 2.2. 


Jackiw’s lectures [115] give a very clear discussion of the theta angle and were the 
basis for the discussion here. Reviews covering a number of different properties of 
instantons can be found in [182, 191, 197]. 


Magnetic Yang-Mills 


The magnetic sector of Yang-Mills theory was part of the story almost from the begin- 
ning. Monopoles in SU(2) gauge theories were independently discovered by ’t Hooft 
[99] and Polyakov [158] in 1974. The extension to general gauge groups was given in 
1977 by Goddard, Nuyts and Olive [80]. This paper includes the GNO quantisation 
condition that we met in our discussion of ’t Hooft lines, and offers some prescient 
suggestions on the role of duality in exchanging gauge groups. (These same ideas rear 
their heads in mathematics in the Langlands program.) 


Bogomolnyi’s Bogomolnyi trick was introduced in [20]. Prasad and Sommerfield then 
solved the resulting equations of motion for the monopole [162], and the initials BPS 
are now engraved on all manner of supersymmetric objects which have nothing to do 
with monopoles. (A more appropriate name for BPS states would be Witten-Olive 
states [217].) Finally, Witten’s Witten effect was introduced in [216]. Excellent reviews 
of ’t Hooft-Polyakov monopoles, both with a focus on the richer BPS sector, can be found 
in Harvey’s lecture notes [89] and in Manton and Sutcliffe’s book [133]. There are also 
some TASI lectures [191]. 




The Nielsen-Olesen vortex was introduced in 1973 [145]. Their motivation came 
from string theory, rather than field theory. The fact that such strings would confine 
magnetic monopoles was pointed out by Nambu [142] and the idea that this is a useful 
analogy for quark confinement, viewed in dual variables, was made some years later by 
Mandelstam [130] and ’t Hooft [100]. 


The ’t Hooft line as a magnetic probe of gauge theories was introduced in [103]. This 
paper also emphasises the importance of the global structure of the gauge group. A 
more modern perspective on line operators was given by Kapustin [120]. A very clear 
discussion of the electric and magnetic line operators allowed in different gauge groups, 
and the way this ties in with the theta angle, can be found in [4]. 


Towards the end of the 1970s, attention began to focus on more general questions of 
the phases of non-Abelian gauge theories [103, 104]. The distinction, or lack thereof, 
between Higgs and confining phases when matter transforms in the fundamental of the 
gauge group was discussed by Fradkin and Shenker [63] and by Banks and Rabinovici 
[9]; both rely heavily on the lattice. The Banks-Zaks fixed point, and its implications 
for the conformal window, was pointed out somewhat later in 1982 [10]. 


3. Anomalies 


We learn as undergraduates that particles come in two types: bosons and fermions. 
Of these, the bosons are the more straightforward since they come back to themselves 
upon a $2\pi$ rotation. Fermions, however, return with a minus sign, a fact which has 
always endowed them with something of an air of mystery. In this section and the next, 
we will begin to learn a little more about the structure of fermions, and we will see the 
interesting and subtle phenomena that arise when fermions are coupled to gauge fields. 


Our interest in this chapter lies with a phenomenon known as a quantum anomaly. 
In fact, there are a number of related phenomena that carry this name. For example, 
later, in Section 3.5, we will describe the so-called ’t Hooft anomaly which can be viewed 
as an obstruction to gauging a global symmetry and, in many ways, this is the key idea 
that underlies this chapter. However, rather than jump straight in with this, we will 
instead build up more slowly. In doing so, our first introduction to an anomaly will be 
slightly different: we will start by describing an anomaly as a symmetry of the classical 
theory which does not survive to the quantum theory. 


Stated in this way, we have already seen an example of an anomaly: classical Yang- 
Mills theory is scale invariant, but this is ruined in the quantum theory by the running 
of the coupling constant and the emergence of the scale $\Lambda_{QCD}$. In this section we will 
primarily be interested in anomalies associated to fermions. We will learn that these 
are intimately connected to various topological aspects of gauge theories and give rise 
to some surprising and beautiful phenomena. 


3.1 The Chiral Anomaly: Building Some Intuition 


Later in this chapter we will describe both the physical intuition and the detailed 
technical calculations that underlie the anomaly. But we start here by describing, 
without proof, the key formula. 


A particularly simple example of an anomaly arises when we have a massless Dirac 
fermion in d = 3+ 1 dimensions, coupled to an electromagnetic gauge field. The action 
for the fermion is 


$$S = \int d^4x \; i\bar\psi \, \slashed{D} \, \psi \tag{3.1}$$

If the gauge field is dynamical, we would add to this the Maxwell action. Alternatively,
we could think of the gauge field as a non-fluctuating background field, something fixed
and under our control.




As we know from our first course on Quantum Field Theory, the action (3.1) has two
global symmetries, corresponding to vector and axial rotations of the fermion. The first
of these simply rotates the phase of $\psi$ by a constant, $\psi \to e^{i\alpha}\psi$, with the corresponding
current

$$j^\mu = \bar\psi \gamma^\mu \psi$$


The action (3.1) includes the coupling $A_\mu j^\mu$ of this current to the background gauge
field. If we want the action to be invariant under gauge transformations $A_\mu \to A_\mu + \partial_\mu \alpha$
(and we do!) then it’s imperative that the current is conserved, so $\partial_\mu j^\mu = 0$. We’ll see
more about the interplay between anomalies and gauge symmetries in Section 3.4.

The other symmetry of (3.1) is the axial rotation, $\psi \to e^{i\alpha\gamma^5}\psi$, with associated
current

$$j_A^\mu = \bar\psi \gamma^\mu \gamma^5 \psi$$


In the classical theory, the standard arguments of Noether tell us that $\partial_\mu j_A^\mu = 0$. While
this is true in the classical theory, it is not true in the quantum theory. Instead, it turns
out that the divergence of the current is given by

$$\partial_\mu j_A^\mu = \frac{e^2}{16\pi^2} \epsilon^{\mu\nu\rho\sigma} F_{\mu\nu} F_{\rho\sigma} \tag{3.2}$$

where $F_{\mu\nu}$ is the electromagnetic field strength. This is known as the chiral anomaly. (It
is sometimes called the ABJ anomaly, after Adler, Bell and Jackiw who first discovered 
it.) The anomaly tells us that in the presence of parallel electric and magnetic fields, 
the axial charge density can change. 
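
To make the statement about parallel fields concrete, the contraction on the right-hand side of (3.2) can be evaluated numerically. This is only a sketch under assumed conventions that the text does not fix at this point: $\epsilon^{0123} = +1$, $F_{0i} = E_i$ and $F_{ij} = -\epsilon_{ijk}B_k$. With these choices $\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma} = -8\,\mathbf{E}\cdot\mathbf{B}$. The overall sign flips with the opposite orientation convention, but the contraction always vanishes unless the electric field has a component along the magnetic field.

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation of 0..n-1, found by sorting it with swaps."""
    sign = 1
    p = list(p)
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

def field_strength(E, B):
    """F_{mu nu} with F_{0i} = E_i and F_{ij} = -eps_{ijk} B_k (assumed convention)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
    for (i, j, k) in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        F[i + 1][j + 1] = -B[k]
        F[j + 1][i + 1] = B[k]
    return F

def eps_F_F(E, B):
    """Contract eps^{mu nu rho sigma} F_{mu nu} F_{rho sigma}, with eps^{0123} = +1."""
    F = field_strength(E, B)
    return sum(perm_sign(p) * F[p[0]][p[1]] * F[p[2]][p[3]]
               for p in permutations(range(4)))

E = (0.3, -1.2, 2.0)
B = (1.5, 0.4, -0.7)
dot = sum(e * b for e, b in zip(E, B))
print(eps_F_F(E, B), -8 * dot)   # the two numbers should agree
```

Choosing perpendicular fields, for example `eps_F_F((1, 0, 0), (0, 1, 0))`, gives zero: in that configuration the anomaly does not change the axial charge.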


Later in this section, we will derive (3.2). In fact, because it’s important, we will 
derive it twice, using different methods. However, it’s easy to get bogged down by 
complicated mathematics in this subject, so we will first try to build some intuition for 
why axial charge is not conserved. 


3.1.1 Massless Fermions in Two Dimensions 


Although our ultimate interest lies in four dimensional fermions (3.1), there is a slightly 
simpler example of the anomaly that arises for a Dirac fermion in d = 1+1 dimensions. 
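(We’ll see a lot more about physics in d = 1+1 dimensions in Section 7.) The Clifford
algebra,

$$\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}, \qquad \mu, \nu = 0, 1$$

with $\eta^{\mu\nu} = \mathrm{diag}(+1, -1)$ is satisfied by the two-dimensional Pauli matrices

$$\gamma^0 = \sigma^1 \quad \text{and} \quad \gamma^1 = i\sigma^2$$

The Dirac spinors are then two-component objects, $\psi$. The action for a massless spinor
is

$$S = \int d^2x \; i\bar\psi \, \slashed{\partial} \, \psi \tag{3.3}$$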


Quantisation of this action will give rise to a particle and an anti-particle. Note that, in 
contrast to fermions in d = 3+1 dimensions, these particles have no internal spin. This 
is for the simple reason that there is no spatial rotation group in d = 1 + 1 dimensions. 


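We can write the action as

$$S = \int d^2x \; i\psi^\dagger \gamma^0 \gamma^\mu \partial_\mu \psi = \int d^2x \; i\psi^\dagger (\partial_t - \gamma^5 \partial_x) \psi \tag{3.4}$$

where

$$\gamma^5 = -\gamma^0\gamma^1 = -i\sigma^1\sigma^2 = \sigma^3$$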
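
The algebra above is small enough to check by machine. The following is a minimal sketch in plain Python (the helper names `matmul`, `add` and `scale` are ours, not part of the notes) which confirms that $\gamma^0 = \sigma^1$, $\gamma^1 = i\sigma^2$ obey the Clifford algebra with $\eta = \mathrm{diag}(+1,-1)$ and that $-\gamma^0\gamma^1 = \sigma^3$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    """Entry-wise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Multiply every entry of a 2x2 matrix by the number c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

sigma1 = [[0, 1], [1, 0]]
sigma2 = [[0, -1j], [1j, 0]]
sigma3 = [[1, 0], [0, -1]]
identity = [[1, 0], [0, 1]]

gamma = [sigma1, scale(1j, sigma2)]   # gamma^0 = sigma^1, gamma^1 = i sigma^2
eta = [[1, 0], [0, -1]]               # eta = diag(+1, -1)

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

# {gamma^mu, gamma^nu} = 2 eta^{mu nu} for all four index pairs
clifford_ok = all(
    anticommutator(gamma[m], gamma[n]) == scale(2 * eta[m][n], identity)
    for m in range(2) for n in range(2)
)

gamma5 = scale(-1, matmul(gamma[0], gamma[1]))   # gamma^5 = -gamma^0 gamma^1
print(clifford_ok, gamma5 == sigma3)
```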


The name “$\gamma^5$” is slightly odd in this d = 1+1 dimensional context, but it is there
to remind us that this matrix is analogous to the $\gamma^5$ that arises for four dimensional
fermions. Just like in four dimensions, we can decompose a massless Dirac fermion into
chiral constituents, determined by its eigenvalue under $\gamma^5$. We write

$$\psi_\pm = \frac{1}{2}(1 \pm \gamma^5)\psi$$

With our choice of basis, the components are

$$\psi_+ = \begin{pmatrix} \chi_+ \\ 0 \end{pmatrix} \quad \text{and} \quad \psi_- = \begin{pmatrix} 0 \\ \chi_- \end{pmatrix}$$

Written in terms of chiral fermions, the action (3.4) then becomes

$$S = \int d^2x \; i\chi_+^\dagger \partial_- \chi_+ + i\chi_-^\dagger \partial_+ \chi_- \tag{3.5}$$

with $\partial_\pm = \partial_t \pm \partial_x$. This tells us how to interpret chiral fermions in d = 1+1 dimensions.


The equation of motion for $\chi_+$ is $\partial_- \chi_+ = 0$ which has the solution $\chi_+ = \chi_+(t + x)$.
In other words, $\chi_+$ is a left-moving fermion. In contrast, $\chi_-$ obeys $\partial_+ \chi_- = 0$ and is a
right-moving fermion: $\chi_- = \chi_-(t - x)$.

Only massless Dirac fermions can be decomposed into independent chiral constituents.
This is clear in d = 1+1 dimensions since massless particles must travel at the speed of
light, so naturally fall into left-moving and right-moving sectors. If we want the particle
to sit still, we need to add a mass term which couples the left-moving and right-moving
fermions: $m\bar\psi\psi = m(\chi_+^\dagger \chi_- + \chi_-^\dagger \chi_+)$.
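
The left- and right-moving solutions quoted above are easy to test numerically. The sketch below is illustrative only: the Gaussian `profile` and the finite-difference step are arbitrary choices of ours. It checks that any function of $t + x$ is annihilated by $\partial_t - \partial_x$, that any function of $t - x$ is annihilated by $\partial_t + \partial_x$, and that the "wrong" operator does not annihilate the field.

```python
import math

def profile(u):
    """An arbitrary smooth wave packet (an assumption; any differentiable shape works)."""
    return math.exp(-u * u)

def chi_plus(t, x):
    """Candidate solution depending only on t + x: its peak sits at x = -t."""
    return profile(t + x)

def chi_minus(t, x):
    """Candidate solution depending only on t - x: its peak sits at x = +t."""
    return profile(t - x)

def light_cone_derivative(field, t, x, sign, h=1e-5):
    """Central-difference estimate of (d/dt + sign * d/dx) acting on field(t, x)."""
    d_t = (field(t + h, x) - field(t - h, x)) / (2 * h)
    d_x = (field(t, x + h) - field(t, x - h)) / (2 * h)
    return d_t + sign * d_x

t0, x0 = 0.3, 0.2
print(light_cone_derivative(chi_plus, t0, x0, sign=-1))   # (d_t - d_x) chi_+
print(light_cone_derivative(chi_minus, t0, x0, sign=+1))  # (d_t + d_x) chi_-
print(light_cone_derivative(chi_plus, t0, x0, sign=+1))   # (d_t + d_x) chi_+
```

The first two numbers vanish up to rounding error, while the third does not.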


We won’t run through the full machinery of canonical quantisation, but the results are
straightforward. One finds that there are both particles and anti-particles. Right-movers
have momentum p > 0 and left-movers have p < 0. All excitations have the dispersion
relation $E = |p|$.

For once, it’s useful to think of this in the Dirac sea language. Here we view the states
as having energy $E = \pm|p|$. The vacuum configuration consists of filling all negative
energy states; these are the red states shown in the figure. Those with E > 0 are
unfilled. In the picture we’ve implicitly put the system on a spatial circle, so that the
momentum states are discrete, but this isn’t necessary for the discussion below.

[Figure 23]


The action (3.5) has two global symmetries which rotate the individual phases of
$\chi_+$ and $\chi_-$. Alternatively, in the language of the Dirac fermion these symmetries are
$\psi \to e^{i\alpha}\psi$ and $\psi \to e^{i\alpha\gamma^5}\psi$. This means that the number $n_-$ of left-moving fermions
and the number $n_+$ of right-moving fermions are separately conserved. This is referred
to as a chiral symmetry.


Naively, we would expect that both $n_+$ and $n_-$ continue to be conserved if we deform
the theory, provided that both symmetries are preserved. This means that we could perturb
the theory in some way which results in a right-moving particle-anti-particle pair being
excited as in the picture. (Note that in this picture, the hole left in the Dirac sea has
momentum p < 0 which, when viewed as a particle, means that it has momentum p > 0
as befits a right-moving excitation.) However, as long as the symmetries remain, we would
not expect to be able to change a left-moving fermion into a right-moving fermion.

[Figure 24]

We will see that this expectation is wrong. One can deform the theory in such a way
that both symmetries are naively preserved, and yet right-moving fermions can change
into left-moving fermions.

Turning on a Background Electric Field


To see the anomaly, we need to deform our theory in some way. We do this by turning
on a background electric field. This means that we replace the action (3.3) with

$$S = \int d^2x \; i\bar\psi \, \slashed{D} \, \psi \tag{3.6}$$

where $D_\mu = \partial_\mu - ieA_\mu$. Here $A_\mu$ is not a fluctuating, dynamical field: instead it is a fixed
background field. Notice that the classical action (3.6) remains invariant under the two
global symmetries and a standard application of Noether’s theorem would suggest that
$n_+$ and $n_-$ are separately conserved. This, it turns out, is not correct.


To see the problem, we turn on an electric field $\mathcal{E}$ for some time t. We choose $\mathcal{E} > 0$
which means that it points towards the right. Because the particles are charged, the
electric field will increase the momentum p, and hence the energy E, of all the filled
states in the Dirac sea: they all get shifted by

$$\Delta p = e\mathcal{E}t \tag{3.7}$$

Both left and right-movers get shifted by the same amount. The net result is the Fermi
surface shown in the figure. But this is precisely what we thought shouldn’t happen:
despite the presence of the symmetry, we have created left-moving anti-particles and
right-moving particles!


[Figure 25]

We can be a little more precise about the violation of the conserved quantity. We
denote by $\rho_+$ the density of right-moving fermions and by $\rho_-$ the density of left-moving
fermions. The shift in momentum (3.7) then becomes a shift in charge density,

$$\dot\rho_+ = \frac{e\mathcal{E}}{2\pi} \quad \text{and} \quad \dot\rho_- = -\frac{e\mathcal{E}}{2\pi}$$

where the extra factor of $1/2\pi$ comes from the density of states. The total number of
fermions is conserved (counting, as usual, particles minus anti-particles). This is the
conservation law that comes from the vector symmetry $\psi \to e^{i\alpha}\psi$:

$$\dot\rho = 0 \quad \text{where} \quad \rho = \rho_+ + \rho_-$$

In contrast, the difference between fermion numbers is not conserved. This is the
quantity that was supposed to be preserved by the axial symmetry $\psi \to e^{i\alpha\gamma^5}\psi$,

$$\dot\rho_A = \frac{e\mathcal{E}}{\pi} \quad \text{where} \quad \rho_A = \rho_+ - \rho_- \tag{3.8}$$

This is known as the axial anomaly or the chiral anomaly.

We seem to have violated Noether’s theorem: the axial symmetry does not give rise
to a conserved quantity. How could this happen? Looking at the picture of the Dirac
sea, it’s clear where these extra fermions came from. They came from infinity! It was
only possible to change left-movers to right-movers because the Dirac sea is infinitely
deep. If we were to truncate the Dirac sea somewhere, then the excess right-movers
would be compensated by a depletion of right-moving states at large, negative energy
and there would be no violation of axial charge. But there is no truncation of the Dirac
sea and, rather like Hilbert’s hotel, the whole chain of right-moving states can
be shifted up, leaving no empty spaces at the bottom.
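
The Hilbert hotel argument can be illustrated with a toy counting model. This is a sketch, not canonical quantisation: we assume the system sits on a circle of length L, label right-moving levels by an integer n with $p_n = 2\pi n/L$, fill all levels with n < 0, and let the electric field move every filled level up by an integer number of steps. The function names are ours.

```python
from fractions import Fraction

def right_mover_charge_infinite(shift):
    """Infinite sea: the filled levels are all n < fermi, so after the shift the
    sea is still full below the new Fermi level and nothing is missing at the bottom."""
    fermi = 0 + shift        # every filled level moves n -> n + shift
    return fermi             # particles created above n = 0, no holes left behind

def right_mover_charge_truncated(shift, depth):
    """Truncated sea: only the levels -depth <= n < 0 are filled to begin with."""
    filled = {n + shift for n in range(-depth, 0)}
    particles = sum(1 for n in filled if n >= 0)
    holes = sum(1 for n in range(-depth, 0) if n not in filled)
    return particles - holes   # charge measured relative to the vacuum

def density_changes(shift, length):
    """Changes in (rho_+, rho_-, rho_A) for the infinite sea on a circle of given length."""
    d_rho_plus = Fraction(right_mover_charge_infinite(shift), length)
    d_rho_minus = -d_rho_plus          # the left-moving branch loses the same number
    return d_rho_plus, d_rho_minus, d_rho_plus - d_rho_minus

print(right_mover_charge_infinite(3))        # net right-moving charge is created
print(right_mover_charge_truncated(3, 50))   # holes at the bottom compensate
print(density_changes(3, 10))
```

Since the shift in momentum is $\Delta p = 2\pi \times \text{shift}/L$, the change in $\rho_+$ returned by `density_changes` is $\Delta p/2\pi$ and the change in $\rho_A$ is $\Delta p/\pi$, in line with (3.8). The truncated sea, in contrast, never changes its net charge.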


This is interesting! The anomaly arises because of the infinite Dirac sea which, in turn,
arises because we are dealing with continuum quantum field theory with an infinite
number of states rather than a finite quantum mechanical system. Ultimately, it is this
difference that allows for anomalies.

[Figure 26]


As a useless aside, here is a picture of an actual 
“Hilbert hotel”, originally in Germany, now sadly closed. 
This hotel appears to be best known as a place that Elvis Presley once stayed. To my 
knowledge there exists no photograph that shows the full height of this hotel: you 
should use your imagination. 


3.1.2 Massless Fermions in Four Dimensions 


The discussion above seems very specific to d = 1 + 1 dimensions, where massless 
fermions split into left-movers and right-movers. However, there is an analogous piece 
of physics in d = 3+ 1 dimensions. For this, we must look at massless fermions in 
background electric and magnetic fields. 


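First some notation. We take the representation of gamma matrices to be

$$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}$$

which obey the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$ in signature $(+---)$. We also
introduce

$$\gamma^5 = -i\gamma^0\gamma^1\gamma^2\gamma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

The Dirac fermion is a four-component spinor $\psi$. This can be split into two, two-component
Weyl spinors $\psi_\pm$ which are eigenvectors of $\gamma^5$. In components we write

$$\psi = \begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix}$$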
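
As a check on the notation, the chiral representation can be verified directly. The sketch below assumes $\gamma^0$ has unit off-diagonal blocks, $\gamma^i$ has blocks $\sigma^i$ and $-\sigma^i$, and $\gamma^5 = -i\gamma^0\gamma^1\gamma^2\gamma^3$; the helper functions are ours. It confirms the Clifford algebra in signature $(+---)$ and that $\gamma^5$ is diagonal with entries $(1, 1, -1, -1)$, so that the upper two components of $\psi$ form $\psi_+$.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    top = [tl[i] + tr[i] for i in range(2)]
    bottom = [bl[i] + br[i] for i in range(2)]
    return top + bottom

zero2 = [[0, 0], [0, 0]]
one2 = [[1, 0], [0, 1]]
sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

gamma = [block(zero2, one2, one2, zero2)]                       # gamma^0
gamma += [block(zero2, s, scale(-1, s), zero2) for s in sigma]  # gamma^1..3
eta = [1, -1, -1, -1]
one4 = block(one2, zero2, zero2, one2)

def clifford_holds():
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} for all sixteen index pairs."""
    for m in range(4):
        for n in range(4):
            lhs = add(matmul(gamma[m], gamma[n]), matmul(gamma[n], gamma[m]))
            rhs = scale(2 * eta[m] if m == n else 0, one4)
            if lhs != rhs:
                return False
    return True

product = one4
for g in gamma:
    product = matmul(product, g)
gamma5 = scale(-1j, product)   # gamma^5 = -i gamma^0 gamma^1 gamma^2 gamma^3

print(clifford_holds(), gamma5 == block(one2, zero2, zero2, scale(-1, one2)))
```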


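We now couple the spinor to a background electromagnetic field $A_\mu$. The action is

$$S = \int d^4x \; i\bar\psi \, \slashed{D} \, \psi = \int d^4x \; i\psi_+^\dagger \bar\sigma^\mu D_\mu \psi_+ + i\psi_-^\dagger \sigma^\mu D_\mu \psi_- \tag{3.10}$$

where $D_\mu = \partial_\mu - ieA_\mu$ and $\sigma^\mu = (1, \sigma^i)$ and $\bar\sigma^\mu = (1, -\sigma^i)$. (Note that we’ve resorted
to the convention where the electric charge sits inside the covariant derivative.)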


We’ll proceed in steps. We’ll first see how these fermions respond to a background
magnetic field $\mathbf{B}$. Setting $A_0 = 0$, the equation of motion for the chiral spinor $\psi_+$ is

$$i\partial_t \psi_+ = i\sigma^i D_i \psi_+ \tag{3.11}$$

Once again, we don’t want to run through the whole process of canonical quantisation.
Instead we’ll cheat and think of this equation in the way that Dirac originally thought
of the Dirac equation: as a one-particle Schrödinger equation for a particle with spin.
In this framework, the Hamiltonian is

$$H = -i\sigma^i D_i = (\mathbf{p} - e\mathbf{A}) \cdot \boldsymbol{\sigma}$$

The spin of the particle is determined by the operator $\mathbf{S} = \frac{1}{2}\boldsymbol{\sigma}$. (For massless particles,
it’s better to refer to this as helicity; we’ll see its interpretation below.) Squaring the
Hamiltonian, and using the fact that $\sigma^i\sigma^j = \delta^{ij} + i\epsilon^{ijk}\sigma^k$, we find

$$H^2 = (\mathbf{p} - e\mathbf{A})^2 - 2e\mathbf{B} \cdot \mathbf{S}$$


The first term is the Hamiltonian for non-relativistic particles in a magnetic field. (See,
for example, the lectures on Applications of Quantum Mechanics.) The second term
leads to a Zeeman splitting between spin states. Let’s choose the magnetic field to lie
in the z-direction, $\mathbf{B} = (0, 0, B)$, and work in Landau gauge so $\mathbf{A} = (0, Bx, 0)$. Then
we have

$$H^2 = p_x^2 + (p_y - eBx)^2 + p_z^2 - 2eBS_z$$

Quantisation of motion in the (x, y)-plane leads to the familiar Landau levels. Each of
these has a large degeneracy: in a region of area A there are $eBA/2\pi$ states which, in
Landau gauge, are distinguished by the quantum number $p_y$. The resulting energy
spectrum is

$$E^2 = eB(2n+1) + p_z^2 - 2eBS_z \quad \text{with} \quad n = 0, 1, 2, \ldots$$

At this point, there’s a rather nice interplay between the energies of the Landau levels
and the Zeeman splitting. This occurs because the eigenvalues of the spin operator $S_z$
are $\pm\frac{1}{2}$. This means that the states with $S_z = +\frac{1}{2}$ in the n = 0 Landau level have
precisely zero energy E = 0. Such states are, quite reasonably, referred to as zero modes.
Meanwhile, the n = 0 states with $S_z = -\frac{1}{2}$ have the same energy as the n = 1 states with
$S_z = +\frac{1}{2}$, and so on. Ignoring $p_z$, the resulting energy spectrum is shown in the figure.
Note, in particular, that the n = 0 Landau level has exactly half the states of the other
levels.

[Figure 27: energy spectrum of the Landau levels, showing the Zeeman splitting]
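
The pairing of Landau levels described above follows directly from the formula for $E^2$, and can be tabulated in a few lines. The sketch below works at $p_z = 0$ and in units where $eB = 1$ by default; both are choices made here for illustration, and the function names are ours.

```python
from fractions import Fraction

UP, DOWN = Fraction(1, 2), Fraction(-1, 2)

def energy_squared(n, s_z, eB=1, p_z=0):
    """E^2 = eB(2n + 1) + p_z^2 - 2 eB S_z for Landau level n and spin S_z = +-1/2."""
    return eB * (2 * n + 1) + p_z ** 2 - 2 * eB * s_z

def spin_multiplicity(level, eB=1, n_max=10):
    """Number of (n, S_z) combinations with E^2 = 2 eB * level at p_z = 0.

    Only reliable for level <= n_max, since larger n are not scanned.
    """
    return sum(1 for n in range(n_max + 1) for s in (UP, DOWN)
               if energy_squared(n, s, eB) == 2 * eB * level)

for level in range(4):
    print(level, spin_multiplicity(level))
```

The lowest level is reached only by (n = 0, $S_z = +\frac{1}{2}$), while every higher level is reached twice, by (n, $+\frac{1}{2}$) and (n - 1, $-\frac{1}{2}$). This is the statement that the zero-energy level has half the states of the others.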


In very high magnetic fields, it is sensible to restrict to the zero modes in the n = 0
Landau level. As we’ve seen, these have spin $+\frac{1}{2}$. This means that they take the form

$$\psi_+(x, y; z, t) = \begin{pmatrix} \chi_+(x, y; z, t) \\ 0 \end{pmatrix}$$


where the notation is there to highlight that these states have a very specific dependence 
on (x, y) as they are zero-energy solutions of the Weyl equation (3.11). Meanwhile, their 
dependence on z and t is not yet fixed. We can determine this by plugging the ansatz 
back into the original action (3.10) to find 


We see that the zero modes arising from $\psi_+$ are all right-movers in the z-direction.


States in higher Landau levels also have an effective description in terms of two- 
dimensional fermions. Because they have particles of both spins, the states include 
both left- and right-movers. Moreover, the non-zero energy of the Landau level results 
in an effective mass for the 2d fermion, coupling the left-movers to the right-movers. 


We can repeat this story for the chiral fermions $\psi_-$. We once again find zero modes,
but the change of sign in the kinetic term (3.10) ensures that they are now
left-movers. Putting both together, the low-energy physics of the lowest Landau level
is governed by the effective action

$$S = A \int dt\, dz \; i\chi_+^\dagger D_- \chi_+ + i\chi_-^\dagger D_+ \chi_-$$


where we’ve re-introduced background gauge fields $A_0$ and $A_z$ which can still couple
to these zero modes. However, we’ve seen this action before: it is the action for a two-dimensional
massless fermion coupled to an electromagnetic field. And, as we’ve seen,
despite appearances it does not have a conservation law associated to chiral symmetry.

We computed the violation of axial charge in two dimensions in (3.8). This immediately
translates into the violation of four-dimensional axial charge. We need only
remember that the lowest Landau level has a degeneracy per area of $eB/2\pi$, and each
of these states contributes to the anomaly. The upshot is that, in four dimensions, the
axial charge changes if we turn on both a magnetic field $\mathbf{B}$ and electric field $\mathbf{E}$ lying in
the same direction,

$$\dot\rho_A = \frac{e^2}{2\pi^2} \, \mathbf{E} \cdot \mathbf{B} \tag{3.12}$$

This is the chiral anomaly for four-dimensional massless fermions. It is equivalent to
our earlier, advertised result (3.2).

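
For the record, the arithmetic behind this coefficient is short. Assuming the electric field has magnitude $\mathcal{E}$ and points along the magnetic field, each zero mode behaves as a two-dimensional fermion obeying (3.8), and there are $eB/2\pi$ of them per unit area:

```latex
\dot\rho_A^{\,(4d)}
  \;=\; \underbrace{\frac{eB}{2\pi}}_{\text{zero modes per unit area}}
  \;\times\; \underbrace{\frac{e\mathcal{E}}{\pi}}_{\text{2d anomaly (3.8)}}
  \;=\; \frac{e^2}{2\pi^2}\,\mathcal{E}B
```
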
3.2 Deriving the Chiral Anomaly 


In the previous section, we’ve seen that the axial charge of a massless fermion is not 
conserved in the presence of background electric and magnetic fields. This lack of 
conservation seems to be in direct contradiction to Noether’s theorem, which states 
that the axial symmetry should result in a conserved charge. What did we miss? 


3.2.1 Noether’s Theorem and Ward Identities 


Let’s first remind ourselves how we prove Noether’s theorem, and how it manifests
itself in the quantum theory. We start by considering a general theory of a scalar field
$\phi$ with a symmetry; we will later generalise this to a fermion and the axial symmetry
of interest.

Noether’s Theorem in Classical Field Theory

Consider the transformation of a scalar field $\phi$

$$\delta\phi = \epsilon X(\phi) \tag{3.13}$$

Here $\epsilon$ is a constant, infinitesimally small parameter. This transformation is a symmetry
if the change in the Lagrangian is

$$\delta\mathcal{L} = 0$$


(We can actually be more relaxed than this and allow the Lagrangian to change by a 
total derivative; this won’t change our conclusions below). 


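The quick way to prove Noether’s theorem is to allow the constant $\epsilon$ to depend on
spacetime: $\epsilon = \epsilon(x)$. Now the Lagrangian is no longer invariant, but changes as

$$\begin{aligned}
\delta\mathcal{L} &= \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} \, \partial_\mu(\epsilon X(\phi)) + \frac{\partial\mathcal{L}}{\partial\phi} \, \epsilon X(\phi) \\
&= (\partial_\mu\epsilon) \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} X(\phi) + \left[ \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} \, \partial_\mu X(\phi) + \frac{\partial\mathcal{L}}{\partial\phi} X(\phi) \right] \epsilon
\end{aligned}$$

But we know that $\delta\mathcal{L} = 0$ when $\epsilon$ is constant, which means that the term in square
brackets must vanish. We’re left with the expression

$$\delta\mathcal{L} = (\partial_\mu\epsilon) J^\mu \quad \text{with} \quad J^\mu = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} X(\phi)$$

The action $S = \int d^4x \, \mathcal{L}$ then changes as

$$\delta S = \int d^4x \; \delta\mathcal{L} = \int d^4x \; (\partial_\mu\epsilon) J^\mu = -\int d^4x \; \epsilon \, \partial_\mu J^\mu \tag{3.14}$$

where we pick $\epsilon(x)$ to decay asymptotically so that we can safely discard the surface
term.

The expression (3.14) holds for any field configuration $\phi$ with the specific change
$\delta\phi$. However, when $\phi$ obeys the classical equations of motion then $\delta S = 0$ for any $\delta\phi$,
including the symmetry transformation (3.13) with $\epsilon(x)$ a function of spacetime. This
means that when the equations of motion are satisfied we have the conservation law

$$\partial_\mu J^\mu = 0$$

This is Noether’s theorem.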
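
To see the formula for $J^\mu$ at work, here is a standard worked example that is not part of the discussion above: a complex scalar field with a phase rotation symmetry. The Lagrangian and potential $V$ are assumptions chosen for illustration, and the current is the sum of the contributions from $\phi$ and $\phi^\star$, treated as independent fields.

```latex
\begin{align}
\mathcal{L} &= \partial_\mu\phi^\star\,\partial^\mu\phi - V(|\phi|^2),
  \qquad \delta\phi = i\epsilon\,\phi, \quad \delta\phi^\star = -i\epsilon\,\phi^\star \\
J^\mu &= \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,(i\phi)
       + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^\star)}\,(-i\phi^\star)
       = i\left(\phi\,\partial^\mu\phi^\star - \phi^\star\,\partial^\mu\phi\right) \\
\partial_\mu J^\mu &= i\left(\phi\,\partial^2\phi^\star - \phi^\star\,\partial^2\phi\right) = 0
  \quad \text{using} \quad \partial^2\phi = -V'\phi, \;\; \partial^2\phi^\star = -V'\phi^\star
\end{align}
```

The cross terms $\partial_\mu\phi\,\partial^\mu\phi^\star$ cancel identically, while the remaining terms cancel only once the equations of motion are used, exactly as in the general argument.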


Ward Identities in Quantum Field Theory 


Let’s now see how this argument plays out in the framework of quantum field theory.
Our tool of choice is the Euclidean path integral,

$$Z[K] = \int \mathcal{D}\phi \; \exp\left( -S[\phi] + \int d^4x \; K\phi \right) \tag{3.15}$$

where K(x) is a background source for $\phi$. (This is usually called J(x) but I didn’t
want to confuse it with the current.) We again consider the symmetry (3.13), this time
writing it as the transformation

$$\phi \to \phi' = \phi + \epsilon(x) X(\phi) \tag{3.16}$$

We view this as a change of variables in the partition function, which now reads

$$Z[K] \to \int \mathcal{D}\phi' \; \exp\left( -S[\phi'] + \int d^4x \; K\phi' \right)$$

The field in the partition function is nothing more than a dummy variable. This
means that the new partition function is exactly the same as the original partition
function (3.15). Nonetheless, we can manipulate this into a useful form. Using the
transformation (3.16), together with (3.14), and expanding to leading order in $\epsilon$, we
have

$$\begin{aligned}
Z[K] &= \int \mathcal{D}\phi' \; \exp\left( -S[\phi] + \int d^4x \; K\phi \right) \exp\left( -\int d^4x \; \epsilon \, (\partial_\mu J^\mu - KX) \right) \\
&\approx \int \mathcal{D}\phi' \; \exp\left( -S[\phi] + \int d^4x \; K\phi \right) \left[ 1 - \int d^4x \; \epsilon \, (\partial_\mu J^\mu - KX) \right]
\end{aligned} \tag{3.17}$$


At this point we need to make a further assumption about the transformation that was
not needed to derive Noether’s theorem in the classical theory: not only should (3.16)
be a symmetry of the action, but also a symmetry of the measure. This means that we
require

$$\mathcal{D}\phi = \mathcal{D}\phi' \tag{3.18}$$

Ultimately, this will be the assumption that breaks down for axial transformations.
But, for now, let’s assume that (3.18) holds and derive the consequences. The first
term in (3.17) (meaning the “1” in the square brackets) is simply our original partition
function (3.15). This means that we have

$$\int \mathcal{D}\phi \; \exp\left( -S[\phi] + \int d^4x \; K\phi \right) \left[ \int d^4x \; \epsilon(x) \, (\partial_\mu J^\mu - KX) \right] = 0$$

But this is true for all $\epsilon(x)$. This means that we can lose the integral to leave ourselves
an expression for each spacetime point,

$$\int \mathcal{D}\phi \; \exp\left( -S[\phi] + \int d^4x \; K\phi \right) \left( \partial_\mu J^\mu - K(x) X(\phi) \right) = 0$$

We can now play with the source K to derive various expressions that involve correlation
functions of $\partial_\mu J^\mu$ and $\phi$. For example, setting K = 0 gives us

$$\langle \partial_\mu J^\mu \rangle = 0$$

Alternatively, we can differentiate with respect to $K(x')$ before setting K = 0 to find

$$\partial_\mu \langle J^\mu(x) \, \phi(x') \rangle = \delta(x - x') \, \langle X(\phi) \rangle \tag{3.19}$$

Differentiating more times gives us the expression

$$\partial_\mu \langle J^\mu(x) \, \phi(x_1) \ldots \phi(x_n) \rangle = 0 \quad \text{for } x \neq x_i$$

while, if x does coincide with one of the insertion points $x_i$ we pick up a term proportional
to $\delta\phi$ on the right-hand side as in (3.19). These expressions are collectively known
as Ward identities. They are sometimes expressed as the operator-valued continuity
equation

$$\partial_\mu J^\mu = 0$$

which is to be viewed as saying that $\partial_\mu J^\mu$ vanishes inside any correlation function, as
long as its position does not coincide with the insertion point of other fields.


The Axial Symmetry 


We can apply all of the above ideas to the theory that we’re really interested in: a
massless Dirac fermion in d = 3+1 dimensions with action (3.1). For now, we will
take $A_\mu$ to be a background gauge field, without its own dynamics. As we reviewed
in the beginning of this section, this theory has both vector and axial symmetry. The
infinitesimal action of the vector rotation $\psi \to e^{i\alpha}\psi$ is

$$\delta\psi = i\epsilon\psi \, , \quad \delta\bar\psi = -i\epsilon\bar\psi \tag{3.20}$$

with the corresponding current

$$j^\mu = \bar\psi \gamma^\mu \psi$$

The infinitesimal version of the axial rotation $\psi \to e^{i\alpha\gamma^5}\psi$ is

$$\delta\psi = i\epsilon\gamma^5\psi \, , \quad \delta\bar\psi = i\epsilon\bar\psi\gamma^5 \tag{3.21}$$

Note that now both $\psi$ and $\bar\psi$ transform in the same way. In Minkowski space, this
follows from the definition $\bar\psi = \psi^\dagger\gamma^0$; in Euclidean space $\psi$ and $\bar\psi$ are viewed as independent
variables and this is simply the transformation necessary to be a symmetry
of the action (3.1). An application of Noether’s theorem as described above gives the
current

$$j_A^\mu = i\bar\psi \gamma^\mu \gamma^5 \psi$$

Repeating the rest of the path integral manipulations seems to tell us that the Ward
identities hold with $\partial_\mu j_A^\mu = 0$. But, as we’ve seen in the previous section, this can’t be
the case: despite the presence of the axial symmetry (3.21), there are situations where
the axial charge is not conserved.


3.2.2 The Anomaly lies in the Measure 


As we mentioned above, in deriving the Ward identities it’s not enough for the action 
to be invariant under a symmetry; the path integral measure must also be invariant. 
This approach to the anomaly is usually called the Fujikawa method. 


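For fermions this measure is schematically

$$\int \mathcal{D}\psi \, \mathcal{D}\bar\psi \tag{3.22}$$

When we change to new variables

$$\psi \to \psi + i\epsilon\gamma^5\psi \, , \quad \bar\psi \to \bar\psi + i\epsilon\bar\psi\gamma^5 \tag{3.23}$$

this measure will pick up a Jacobian factor. As we now show, it is this Jacobian that
gives rise to the anomaly.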


Our first task is to explain what we mean by the field theoretic measure (3.22). To
do this, let’s consider the Dirac operator $\slashed{D}$ for a spinor in the background of a fixed
electromagnetic field $A_\mu$. This operator will have eigenspinors; these are c-number (i.e.
not Grassmann-valued) four-component spinors $\phi_n$ satisfying

$$i\slashed{D}\phi_n = \lambda_n \phi_n \tag{3.24}$$

We expand a general spinor $\psi$ in terms of these eigenspinors,

$$\psi(x) = \sum_n a_n \phi_n(x) \tag{3.25}$$

where $a_n$ are Grassmann-valued numbers. Similarly, we can expand $\bar\psi$ in terms of
eigenspinors

$$\bar\psi(x) = \sum_n \bar{b}_n \bar\phi_n(x)$$

As usual, eigenspinors with distinct eigenvalues are orthogonal, and those with the
same eigenvalues can be chosen to be orthogonal. In the present context, this means

$$\int d^4x \; \bar\phi_n \phi_m = \delta_{nm} \tag{3.26}$$

In terms of the eigenspinor expansion, the action reads

$$S = \int d^4x \; i\bar\psi \, \slashed{D} \, \psi = \sum_n \lambda_n \bar{b}_n a_n$$

In this language, the fermion measure (3.22) is defined to be

$$\prod_n \int d\bar{b}_n \, da_n$$

Of course, Grassmann integrations are easy. We have $\int da = 0$ and $\int da \, a = 1$, with
similar expressions for $\bar{b}$. If we wished to evaluate the Euclidean partition function in
this language, we would have

$$\int \mathcal{D}\psi \, \mathcal{D}\bar\psi \; e^{-S} = \prod_n \int d\bar{b}_n \, da_n \; e^{-\sum_m \lambda_m \bar{b}_m a_m} = \prod_n \lambda_n = \det i\slashed{D}$$

This approach hasn’t rescued us from the usual infinities that arise in continuum quantum
field theory: we’re left with an infinite product which will, in general, diverge. To
make sense of this expression we will have to play the usual regularisation games. We’ll
see a particular example of this below.


The Jacobian 


Now that we’ve got a slightly better definition of the fermion measure, we can see how
it fares under the position-dependent chiral rotation

$$\delta\psi = i\epsilon(x)\gamma^5\psi$$

Such a transformation changes the Grassmann parameters $a_n$ in our expansion (3.25),

$$\sum_n \delta a_n \, \phi_n = i\epsilon(x) \sum_m a_m \gamma^5 \phi_m$$

Using the orthogonality relation (3.26), we have

$$\delta a_n = X_{nm} a_m \quad \text{with} \quad X_{nm} = i \int d^4x \; \epsilon(x) \, \bar\phi_n \gamma^5 \phi_m$$

We want to compute the Jacobian for the transformation from $a_n$ to $a_n' = a_n + X_{nm} a_m$.
Fortunately, the transformation is linear in $a_n$ which means that the Jacobian will not
depend on the value of $a_n$. If we were dealing with commuting, c-number objects this
would be det(1+X). But integration for Grassmann variables is closer to differentiation
and, for this reason, the Jacobian is actually the inverse determinant. We therefore
have

$$J = \det{}^{-1}(\delta_{nm} + X_{nm})$$
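
The appearance of the inverse determinant is easiest to see for a single variable rescaled by a constant c. The following one-line comparison is a sketch of the standard argument, using only the rules $\int da = 0$ and $\int da \, a = 1$ quoted above:

```latex
\begin{align}
\text{commuting:} \quad & x' = c\,x \;\;\Rightarrow\;\; dx' = c\,dx \\
\text{Grassmann:} \quad & a' = c\,a, \qquad
  1 = \int da'\,a' = c \int da'\,a \;\;\Rightarrow\;\; da' = c^{-1}\,da
\end{align}
```

For many variables related by a matrix, the factor $c^{-1}$ is replaced by the inverse determinant of that matrix.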


Because the axial symmetry (3.21) acts on both $\psi$ and $\bar\psi$ in the same way, we get the
same Jacobian for the transformation of $\bar{b}_n$. This means that we have

$$\prod_n \int d\bar{b}_n \, da_n = \prod_n \int d\bar{b}_n' \, da_n' \; J^2$$

Before we proceed, it’s worth pausing to point out why the vector and axial transformations
differ. For the vector transformation (3.20), we have $\delta\psi = i\epsilon\psi$ and $\delta\bar\psi = -i\epsilon\bar\psi$.
This extra minus sign means that the Jacobian factors for $\psi$ and $\bar\psi$ have the form
$\det^{-1}(1 + Y)$ and $\det^{-1}(1 - Y)$ respectively, with Y similar to X but without the $\gamma^5$
matrix. This extra minus sign means that the two factors cancel to leading order in $\epsilon$,
so the Jacobian is trivial at this order; as we will see below, this is sufficient to ensure
that it does not contribute to the Ward identities.

Returning to the axial symmetry, we need only evaluate the Jacobian to leading
order in $\epsilon$; the group structure of the symmetry will do the rest of the work for us. At
this level, we can write

$$J = \det{}^{-1}(1 + X) \approx \det(1 - X) \approx \det e^{-X} = e^{-\mathrm{Tr}\, X}$$

where Tr here means the trace over spinor indices, as well as integration over space.
Written in full, we have

$$J = \exp\left( -i \int d^4x \; \epsilon(x) \sum_n \bar\phi_n \gamma^5 \phi_n \right) \tag{3.27}$$

Our task is to calculate this.
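
The leading-order approximation $\det^{-1}(1 + X) \approx e^{-\mathrm{Tr}\,X}$ can be tested on a finite matrix. The sketch below is a toy: it uses an arbitrary real 2 × 2 matrix in place of the infinite-dimensional $X_{nm}$, scaled by a small parameter `eps`. If the two expressions agree to first order, the mismatch should shrink like `eps` squared.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse_det_of_one_plus(x):
    """det^{-1}(1 + X) for a 2x2 matrix X."""
    one_plus_x = [[1 + x[0][0], x[0][1]], [x[1][0], 1 + x[1][1]]]
    return 1 / det2(one_plus_x)

def exp_minus_trace(x):
    """exp(-Tr X) for a 2x2 matrix X."""
    return math.exp(-(x[0][0] + x[1][1]))

def mismatch(eps, m=((0.7, -1.3), (2.1, 0.4))):
    """|det^{-1}(1 + eps*M) - exp(-eps*Tr M)| for an arbitrary sample matrix M."""
    x = [[eps * m[i][j] for j in range(2)] for i in range(2)]
    return abs(inverse_det_of_one_plus(x) - exp_minus_trace(x))

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, mismatch(eps))
```

Reducing `eps` by a factor of ten reduces the mismatch by roughly a factor of one hundred, which is the expected second-order behaviour.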


Calculating the Jacobian 


We have to be a little careful in evaluating J. To illustrate this, here are two naive,
non-careful arguments for the value of J:

• The first argument says that J = 0. This is because it involves a trace over spinor
indices and $\mathrm{tr}\,\gamma^5 = 0$.

• The second argument says that $J = \infty$. This is because, at each point x, we’re
summing over an infinite number of modes n and there is no reason to think
that this sum converges.

The truth, of course, is that neither of these arguments is quite right. Instead, they
play off against each other: when we understand how to regulate the sum, we will see
why we’re not left with $\mathrm{tr}\,\gamma^5$. And when we take the resulting trace, we’ll see why the
sum is not infinite.


Let’s first worry about the divergence. We want to regulate the sum over modes in
a manner consistent with gauge invariance. The one useful, gauge invariant, piece of
information that we have about each mode is its eigenvalue $\lambda_n$. This motivates us to
write

$$\begin{aligned}
\int d^4x \; \epsilon(x) \sum_n \bar\phi_n \gamma^5 \phi_n &= \lim_{\Lambda\to\infty} \int d^4x \; \epsilon(x) \sum_n \bar\phi_n \gamma^5 e^{-\lambda_n^2/\Lambda^2} \phi_n \\
&= \lim_{\Lambda\to\infty} \int d^4x \; \epsilon(x) \sum_n \bar\phi_n \gamma^5 e^{-(i\slashed{D})^2/\Lambda^2} \phi_n
\end{aligned} \tag{3.28}$$

where $\Lambda$ is a regularisation scale. It has dimension of energy and, as shown above, we
will ultimately send $\Lambda \to \infty$.

Notice that, already, we can see how we evade our first naive argument. The regulator
has introduced extra gamma matrix structure into our expression, which means that
we no longer get to argue that J is proportional to $\mathrm{tr}\,\gamma^5$ and so necessarily vanishes.
Instead, the trace over gamma matrices will greatly restrict the form of J.


In the expression above, we’re taking a sum over states $\phi_n(x)$. Such a sum can be
viewed as a trace of whatever operator $\mathcal{O}$ is inserted between these states. But we can
equally well write the trace in any basis. The most familiar is the basis of plane waves
$e^{ik\cdot x}$ together with a trace over spinor indices. Implementing this change of basis means
that we can write

$$\sum_n \bar\phi_n \gamma^5 e^{\slashed{D}^2/\Lambda^2} \phi_n = \int \frac{d^4k}{(2\pi)^4} \; \mathrm{tr}\left( \gamma^5 e^{-ik\cdot x} e^{\slashed{D}^2/\Lambda^2} e^{ik\cdot x} \right) \tag{3.29}$$

where now tr denotes only the trace over spinor indices.

(If the step (3.29) seems confusing, it might make you more comfortable to mention
that it’s the kind of manipulation that we do all the time in quantum mechanics.
In that context, we have a basis of states $|\phi_n\rangle$ with wavefunction $\phi_n(x)$. We
would write $\sum_n \phi_n^\dagger(x) \mathcal{O} \phi_n(x) = \sum_n \langle\phi_n|x\rangle\langle x|\mathcal{O}|\phi_n\rangle = \langle x|\mathcal{O}|x\rangle = \int \frac{d^4k}{(2\pi)^4} \langle k|x\rangle\langle x|\mathcal{O}|k\rangle = \int \frac{d^4k}{(2\pi)^4} \, e^{-ik\cdot x} \mathcal{O} e^{ik\cdot x}$.
Note, however, that in the present context, the eigenspinors $\phi_n(x)$ are
a basis of fields rather than states in a Hilbert space.)


The expression (3.29) still looks like it’s difficult to evaluate. But we’ve got two things
going for us, both descendants of the naive arguments we tried to use previously:

• The trace tr over spinor indices vanishes when taken over most products of gamma
matrices. In particular, we have

$$\mathrm{tr}\,\gamma^5 = \mathrm{tr}\,\gamma^5\gamma^\mu\gamma^\nu = 0$$

However, if we multiply all five (Euclidean) gamma matrices together we get the
identity matrix. This is captured by the expression

$$\mathrm{tr}\,\gamma^5\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma = 4\epsilon^{\mu\nu\rho\sigma}$$

We’ll need this expression shortly.

• We still want to send $\Lambda \to \infty$ to compute the Jacobian (3.27). Our strategy will
be to Taylor expand the exponential $e^{\slashed{D}^2/\Lambda^2}$. But higher powers come with higher
powers of $\Lambda$ in the denominator which, as we will see, will eventually ensure that
they vanish.
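
The trace identities in the first bullet point can be confirmed by brute force. The sketch below assumes one particular Hermitian representation of the Euclidean gamma matrices (off-diagonal blocks $\mp i\sigma^j$ for the first three, unit off-diagonal blocks for the fourth) and takes $\gamma^5$ to be the ordered product of all four. These are our choices for illustration; the identities themselves only rely on the Clifford algebra. Euclidean indices 1 to 4 are stored as 0 to 3.

```python
from itertools import product

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def trace(a):
    return sum(a[i][i] for i in range(4))

SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def euclidean_gammas():
    """Four Hermitian matrices obeying {gamma^mu, gamma^nu} = 2 delta^{mu nu}."""
    gammas = []
    for s in SIGMA:
        upper = [[0, 0] + [-1j * x for x in row] for row in s]
        lower = [[1j * x for x in row] + [0, 0] for row in s]
        gammas.append(upper + lower)
    gammas.append([[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]])
    return gammas

def epsilon(idx):
    """Totally antisymmetric symbol with epsilon(0, 1, 2, 3) = +1."""
    if len(set(idx)) < 4:
        return 0
    sign, p = 1, list(idx)
    for i in range(4):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

def trace_identities_hold(tol=1e-12):
    g = euclidean_gammas()
    g5 = matmul(matmul(g[0], g[1]), matmul(g[2], g[3]))
    if abs(trace(g5)) > tol:                                   # tr gamma^5 = 0
        return False
    for m, n in product(range(4), repeat=2):                   # tr gamma^5 gamma gamma = 0
        if abs(trace(matmul(g5, matmul(g[m], g[n])))) > tol:
            return False
    for idx in product(range(4), repeat=4):                    # four gammas give 4 epsilon
        mat = g5
        for i in idx:
            mat = matmul(mat, g[i])
        if abs(trace(mat) - 4 * epsilon(idx)) > tol:
            return False
    return True

print(trace_identities_hold())
```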


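Let’s now see how this works. First, we need a couple of identities involving the
covariant derivative. The first is

$$\begin{aligned}
\slashed{D}^2 = \gamma^\mu\gamma^\nu D_\mu D_\nu &= \frac{1}{2}\{\gamma^\mu, \gamma^\nu\} D_\mu D_\nu + \frac{1}{2}[\gamma^\mu, \gamma^\nu] D_\mu D_\nu \\
&= D^2 + \frac{1}{4}[\gamma^\mu, \gamma^\nu][D_\mu, D_\nu] \\
&= D^2 - \frac{ie}{2}\gamma^\mu\gamma^\nu F_{\mu\nu}
\end{aligned}$$

The second is

$$e^{-ik\cdot x} D_\mu e^{ik\cdot x} = D_\mu + ik_\mu$$

Combining these, we have

$$\begin{aligned}
e^{-ik\cdot x} e^{\slashed{D}^2/\Lambda^2} e^{ik\cdot x} &= e^{-ik\cdot x} \, e^{D^2/\Lambda^2 - \frac{ie}{2}\gamma^\mu\gamma^\nu F_{\mu\nu}/\Lambda^2} \, e^{ik\cdot x} \\
&= e^{(D_\mu + ik_\mu)^2/\Lambda^2 - \frac{ie}{2}\gamma^\mu\gamma^\nu F_{\mu\nu}/\Lambda^2} \\
&= e^{(D_\mu + ik_\mu)^2/\Lambda^2} \, e^{-\frac{ie}{2}\gamma^\mu\gamma^\nu F_{\mu\nu}/\Lambda^2} \ldots
\end{aligned} \tag{3.30}$$

Here the extra terms in the ... follow from the BCH formula. They do not vanish but,
as we will see, we will not need them.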


We want to Taylor expand these exponentials. In particular, we have

$$e^{-\frac{ie}{2}\gamma^\mu\gamma^\nu F_{\mu\nu}/\Lambda^2} = 1 - \frac{ie}{2\Lambda^2}\gamma^\mu\gamma^\nu F_{\mu\nu} - \frac{e^2}{8\Lambda^4}\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma F_{\mu\nu}F_{\rho\sigma} + \ldots \tag{3.31}$$

From our arguments above about the spinor traces, we see that only the last of these
terms contributes. This term scales as $1/\Lambda^4$ and we clearly need to compensate for this
before we take the limit $\Lambda \to \infty$ in (3.28). Fortunately, this compensation comes courtesy
of the $\int d^4k$ which will give the $\Lambda^4$ term that we need. (You may want to first shift
$k_\mu \to k_\mu + A_\mu(x)$ to absorb the potential in the covariant derivative.)
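
The compensating power of $\Lambda$ comes from a Gaussian integral, $\int \frac{d^4k}{(2\pi)^4} e^{-k^2/\Lambda^2} = \Lambda^4/16\pi^2$, which is what remains of the first exponential in (3.30) once the derivative terms are dropped. The sketch below checks this numerically with a simple midpoint rule. The cutoff value, step count and integration range are arbitrary choices of ours.

```python
import math

def gaussian_1d(cutoff, steps=20000, span=10.0):
    """Midpoint-rule estimate of the integral of exp(-k^2/cutoff^2) dk/(2 pi) over the real line.

    The integration range is truncated at +-span*cutoff, where the integrand is negligible.
    """
    half_width = span * cutoff
    dk = 2 * half_width / steps
    total = sum(math.exp(-((-half_width + (i + 0.5) * dk) / cutoff) ** 2)
                for i in range(steps))
    return total * dk / (2 * math.pi)

def gaussian_4d(cutoff):
    """The four-dimensional integral factorises into four identical one-dimensional ones."""
    return gaussian_1d(cutoff) ** 4

cutoff = 3.0
print(gaussian_4d(cutoff), cutoff ** 4 / (16 * math.pi ** 2))

# Combine with the 1/(8 Lambda^4) from (3.31) and the factor 4 from the spinor trace:
# the cutoff dependence drops out, leaving 1/(32 pi^2) as the size of the coefficient.
coefficient = gaussian_4d(cutoff) * 4 / (8 * cutoff ** 4)
print(coefficient, 1 / (32 * math.pi ** 2))
```

The second pair of numbers shows that the result is independent of the cutoff, which is why a finite answer survives the limit $\Lambda \to \infty$.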


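There will also be other terms in the expansion (3.31) which are non-zero after the
trace. There will also be further terms from the BCH contributions in (3.30). However,
all of these will scale with some power $1/\Lambda^n$ with n > 4 and so will vanish when we take
the $\Lambda \to \infty$ limit. A similar argument holds for the $e^{D^2/\Lambda^2}$ terms in the first exponent
in (3.30). We end up with

$$\sum_n \bar\phi_n \gamma^5 \phi_n = \lim_{\Lambda\to\infty} \int \frac{d^4k}{(2\pi)^4} \; \mathrm{tr}\left( \gamma^5 e^{-k^2/\Lambda^2} \left( -\frac{e^2}{8\Lambda^4} \right) \gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma F_{\mu\nu}F_{\rho\sigma} \right) = -\frac{e^2}{32\pi^2} \epsilon^{\mu\nu\rho\sigma} F_{\mu\nu}F_{\rho\sigma} \tag{3.32}$$

This is what we need.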


The Anomalous Ward Identity 


Let’s put these pieces together. We’ve learned that under a chiral transformation
$\delta\psi = i\epsilon(x)\gamma^5\psi$, the fermion measure picks up a Jacobian factor (3.27) which is calculated in
(3.32). The transformation $\delta\bar\psi = i\epsilon(x)\bar\psi\gamma^5$ gives us another factor of this Jacobian so,
in total, the measure transforms as

$$\int \mathcal{D}\psi \, \mathcal{D}\bar\psi \to \int \mathcal{D}\psi \, \mathcal{D}\bar\psi \; \exp\left( \frac{ie^2}{16\pi^2} \int d^4x \; \epsilon(x) \, \epsilon^{\mu\nu\rho\sigma} F_{\mu\nu}F_{\rho\sigma} \right) \tag{3.33}$$

It is a simple matter to follow the fate of this term when deriving the Ward identities
described in Section 3.2.1. We find that the current $j_A^\mu = i\bar\psi\gamma^\mu\gamma^5\psi$ associated to axial
transformations is no longer conserved: instead it obeys

$$\partial_\mu j_A^\mu = \frac{e^2}{16\pi^2} \epsilon^{\mu\nu\rho\sigma} F_{\mu\nu}F_{\rho\sigma} \tag{3.34}$$

This is our promised result (3.2) for the chiral anomaly.

We saw in Section 1.2 that the right-hand side of (3.34) is itself a total derivative,

$$\epsilon^{\mu\nu\rho\sigma} F_{\mu\nu}F_{\rho\sigma} = 4\partial_\mu \left( \epsilon^{\mu\nu\rho\sigma} A_\nu \partial_\rho A_\sigma \right)$$

It’s tempting to attempt to define a new conserved current that is, roughly, $j_A - {\star}A\,dA$.
But this is illegal because it’s not gauge invariant. Hopefully our discussion in Sections
3.1.1 and 3.1.2 has already convinced you that there’s no escaping the anomaly: it is
a real physical effect.


There are a number of straightforward generalisations of this result. First, if we have
$N_f$ massless Dirac fermions, then the anomaly becomes

$$\partial_\mu j_A^\mu = \frac{e^2 N_f}{16\pi^2} \epsilon^{\mu\nu\rho\sigma} F_{\mu\nu}F_{\rho\sigma}$$

Alternatively, we could return to a single Dirac fermion, but give it a mass m. This
explicitly breaks the axial symmetry. Nonetheless, the anomaly remains and the divergence
of the axial current is now given by

$$\partial_\mu j_A^\mu = -2im\bar\psi\gamma^5\psi + \frac{e^2}{16\pi^2} \epsilon^{\mu\nu\rho\sigma} F_{\mu\nu}F_{\rho\sigma}$$


For the purpose of our discussion above, we took the fermions to be dynamical (in 
the sense that we integrated over them in the path integral), while the gauge field $A_\mu$
took some fixed, background value. However, nothing stops us promoting the gauge 
field to also be dynamical, in which case we are discussing QED. The calculation above 
goes through without a hitch, and the result (3.34) still holds. 


With dynamical gauge fields, one might wonder if there are extra corrections to the 
chiral anomaly. In fact, this is not the case. For deep reasons, the result (3.34) is exact; 
it receives neither perturbative nor non-perturbative corrections. We will start to get 
a sense of why this is in Section 3.3.1. 


The Anomaly in Non-Abelian Gauge Theories 


It is a simple matter to adapt the above arguments to non-Abelian gauge theories. For 
example, we may have a Dirac fermion transforming in some representation R of a 
non-Abelian gauge group, with field strength $F_{\mu\nu}$. The Lagrangian for the fermion is

$$\mathcal{L} = i\bar\psi\gamma^\mu(\partial_\mu - iA_\mu)\psi$$

The calculation that we did above goes through essentially unchanged; we need only include a trace over the colour indices. We now have

$$\partial_\mu j^\mu_A = \frac{1}{16\pi^2}\,\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\,F_{\mu\nu}F_{\rho\sigma} \qquad (3.35)$$

Note that the overall factor of $e^2$ has disappeared because we are here working in the
conventions described in Section 2.1.1 in which the coupling constant sits as an overall 
factor in the action. 


The Anomaly in Two Dimensions 


It is also a simple matter to adapt the above arguments for fermions in $d = 1+1$ dimensions (or, indeed, for fermions in any even number of spacetime dimensions). Now the gamma matrices are $2\times 2$ and, in Euclidean space, we have

$${\rm tr}\,\gamma^5\gamma^\mu\gamma^\nu = 2i\epsilon^{\mu\nu}$$

which means that the term linear in $F$ in (3.31) is now non-vanishing. The factor $1/\Lambda^2$ is compensated by the divergent factor coming from the $\int d^2k$ integral. Repeating the derivation above, we this time find

$$\partial_\mu j^\mu_A = \frac{e}{\pi}F_{01} \qquad (3.36)$$

This agrees with our earlier, heuristic derivation (3.8). Note that, in $d = 1+1$, one only gets an anomaly for Abelian gauge groups. Attempting to repeat the calculation for, say, $SU(N)$ would give ${\rm tr}\,F_{01} = 0$ on the right-hand side.


3.2.3 Triangle Diagrams 


There are many different approaches to computing the anomaly. The path integral 
approach that we saw above is arguably the most useful for our purposes. But it is 
worthwhile to see how the anomaly arises in other contexts. In this section, we see how 
the anomaly appears in perturbation theory. Indeed, this is how the anomaly was first 
discovered. 


We will start by considering a free, massless Dirac fermion,

$$S = \int d^4x\; i\bar\psi\,\slashed{\partial}\,\psi$$

The essence of the argument is as follows. We will look at a certain class of one-loop Feynman diagrams known as “triangle diagrams”. These are special because they involve both the $U(1)_V$ current $j^\mu = \bar\psi\gamma^\mu\psi$ and the $U(1)_A$ current $j^\mu_A = \bar\psi\gamma^\mu\gamma^5\psi$. Even in our free theory, these triangle diagrams are UV divergent and need regulating. The crux of the argument is that any regulation necessarily violates either the $U(1)_V$ symmetry or the $U(1)_A$ symmetry; there is no way to make sense of the triangle diagram preserving both symmetries. As we remove the regulator, its memory lingers through the loss of one of these symmetries. This is the anomaly.

Let’s now see this in detail. We focus on the three-point correlator containing two vector currents and a single axial current:

$$\Gamma^{\mu\nu\rho}(x_1,x_2,x_3) = \langle 0|\,T\left(j^\mu(x_1)\,j^\nu(x_2)\,j^\rho_A(x_3)\right)|0\rangle$$


where, as usual, T denotes time-ordering, for Minkowski space correlators. In Euclidean 
space, no such ordering is necessary. 


With hindsight, it is possible to see why we should look at this particular correlator: the anomaly equation (3.34) includes a single axial current $j_A$ and two gauge fields, each of which couples to the vector current $j$.


It is simplest to work in momentum space. The Fourier transform is

$$\int d^4x_1\,d^4x_2\,d^4x_3\;\Gamma^{\mu\nu\rho}(x_1,x_2,x_3)\,e^{i(p_1\cdot x_1+p_2\cdot x_2+q\cdot x_3)} = \Gamma^{\mu\nu\rho}(p_1,p_2,q)\,\delta^4(p_1+p_2+q)$$

where we’re using the notation that the function and its Fourier transform are distinguished only by the arguments. The delta-function on the right-hand side arises because our theory is translationally invariant. Tracing their origin, we note that the momenta $p_1$ and $p_2$ refer to the vector currents, while $q$ refers to the axial current.


Before we explore the anomaly, let’s first see what we would naively expect the conservation of currents to imply for $\Gamma^{\mu\nu\rho}(p_1,p_2,q)$. Consider

$$p_{1\mu}\Gamma^{\mu\nu\rho}(p_1,p_2,q) = -i\int d^4x_1\,d^4x_2\,d^4x_3\;\Gamma^{\mu\nu\rho}(x_1,x_2,x_3)\,\frac{\partial}{\partial x_1^\mu}\,e^{i(p_1\cdot x_1+p_2\cdot x_2+q\cdot x_3)}$$

After integrating by parts, the derivative acts on the correlator, and this is the kind of expression that we computed in Section 3.2.1. The Ward identity tells us that $\partial_\mu j^\mu = 0$ holds as an operator equation. There is a delta-function contact term that arises when $x_1 = x_2$ or $x_1 = x_3$ (this can be seen on the right-hand side of (3.19)) but it vanishes in this case because neither of the currents $j^\nu$ nor $j^\rho_A$ transforms under the symmetry. (The fact that $j^\nu$ does not transform is the statement that the symmetry is Abelian.) The result is that the Ward identity for the conserved vector current takes a particularly simple form in momentum space,

$$p_{1\mu}\Gamma^{\mu\nu\rho}(p_1,p_2,q) = 0 \qquad (3.37)$$

and, equivalently,

$$p_{2\nu}\Gamma^{\mu\nu\rho}(p_1,p_2,q) = 0$$

Meanwhile, we can run exactly the same argument for the conservation of the axial symmetry to find

$$q_\rho\Gamma^{\mu\nu\rho}(p_1,p_2,q) = 0 \quad\Leftrightarrow\quad -(p_1+p_2)_\rho\,\Gamma^{\mu\nu\rho}(p_1,p_2,q) = 0 \qquad (3.38)$$

where the equivalence of these expressions comes from 4-momentum conservation: $p_1 + p_2 + q = 0$. (Note that a different index is contracted so this final expression does not follow from the previous two.) As we will now see, the anomaly means that things aren’t quite this simple.


Triangle Diagrams 


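The leading order contribution to our three-point function comes from one-loop triangle diagrams,

$-i\Gamma^{\mu\nu\rho}(p_1,p_2,q)$ = [Figure: the sum of two one-loop triangle diagrams. In the first, the vector currents carry momenta $p_1$ and $p_2$, the axial current carries momentum $q$, and the internal fermion lines carry momenta $k$, $k+p_1$ and $k-q$. The second is the same diagram with $p_1$ and $p_2$ exchanged.] (3.39)

In terms of equations, these diagrams read

$$-i\Gamma^{\mu\nu\rho}(p_1,p_2,q) = -\int\frac{d^4k}{(2\pi)^4}\;{\rm tr}\left(\frac{i}{\slashed{k}}\,\gamma^\rho\gamma^5\,\frac{i}{\slashed{k}-\slashed{q}}\,\gamma^\nu\,\frac{i}{\slashed{k}+\slashed{p}_1}\,\gamma^\mu\right) + \left(\begin{array}{c}p_1\leftrightarrow p_2\\ \mu\leftrightarrow\nu\end{array}\right)$$

where the overall minus sign comes from Wick contracting the fermions and the trace is over the gamma matrix structure.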


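We will check all three of the Ward identities above. We start with the one we are most nervous about: (3.38). This now reads

$$-iq_\rho\Gamma^{\mu\nu\rho}(p_1,p_2,q) = i\int\frac{d^4k}{(2\pi)^4}\;{\rm tr}\left(\frac{1}{\slashed{k}}\,\slashed{q}\gamma^5\,\frac{1}{\slashed{k}-\slashed{q}}\,\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{p}_1}\,\gamma^\mu\right) + \left(\begin{array}{c}p_1\leftrightarrow p_2\\ \mu\leftrightarrow\nu\end{array}\right)$$

To proceed, we use the identity

$$\slashed{q}\gamma^5 = -\gamma^5\slashed{q} = \gamma^5(\slashed{k}-\slashed{q}) + \slashed{k}\gamma^5$$

to find

$$-iq_\rho\Gamma^{\mu\nu\rho}(p_1,p_2,q) = i\int\frac{d^4k}{(2\pi)^4}\;{\rm tr}\left(\frac{1}{\slashed{k}}\,\gamma^5\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{p}_1}\,\gamma^\mu + \gamma^5\,\frac{1}{\slashed{k}-\slashed{q}}\,\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{p}_1}\,\gamma^\mu\right) + \left(\begin{array}{c}p_1\leftrightarrow p_2\\ \mu\leftrightarrow\nu\end{array}\right)$$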
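The identity $\slashed{q}\gamma^5 = -\gamma^5\slashed{q} = \gamma^5(\slashed{k}-\slashed{q}) + \slashed{k}\gamma^5$ relies only on $\gamma^5$ anticommuting with every $\gamma^\mu$, so it can be checked numerically for arbitrary momenta. The following is a minimal sketch. The chiral-basis gamma matrices, the mostly-minus metric and the sample momenta are assumptions made for illustration, not necessarily the conventions of (3.9).

```python
# Chiral-basis gamma matrices (an assumed, standard choice) built from 2x2 blocks.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]


def neg(m):
    return [[-x for x in row] for row in m]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def lincomb(ca, a, cb, b):
    """Return the matrix ca * a + cb * b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


GAMMA = [block(ZERO2, ONE2, ONE2, ZERO2)] + [
    block(ZERO2, s, neg(s), ZERO2) for s in SIGMA
]
GAMMA5 = block(ONE2, ZERO2, ZERO2, neg(ONE2))
METRIC = [1, -1, -1, -1]  # mostly-minus signature (assumption)


def slash(p):
    """Return gamma^mu p_mu for a momentum p given with upper indices."""
    out = [[0] * 4 for _ in range(4)]
    for mu in range(4):
        out = lincomb(1, out, METRIC[mu] * p[mu], GAMMA[mu])
    return out


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Arbitrary sample momenta (hypothetical values).
k = [0.3, -1.2, 0.7, 2.0]
q = [1.1, 0.4, -0.9, 0.5]
k_minus_q = [ki - qi for ki, qi in zip(k, q)]

lhs = matmul(slash(q), GAMMA5)
middle = neg(matmul(GAMMA5, slash(q)))
rhs = lincomb(1, matmul(GAMMA5, slash(k_minus_q)), 1, matmul(slash(k), GAMMA5))

identity_error = max(max_abs_diff(lhs, middle), max_abs_diff(lhs, rhs))
print(identity_error)
```

Because the check is purely algebraic, any other representation of the Clifford algebra could be substituted for `GAMMA` and `GAMMA5` without changing the outcome.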
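We’re left with four terms. We gather them like this:

$$-iq_\rho\Gamma^{\mu\nu\rho}(p_1,p_2,q) = \Delta_1^{\mu\nu} + \Delta_2^{\mu\nu}$$

where

$$\Delta_1^{\mu\nu} = i\int\frac{d^4k}{(2\pi)^4}\;{\rm tr}\left[\frac{1}{\slashed{k}}\,\gamma^5\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{p}_1}\,\gamma^\mu + \gamma^5\,\frac{1}{\slashed{k}-\slashed{q}}\,\gamma^\mu\,\frac{1}{\slashed{k}+\slashed{p}_2}\,\gamma^\nu\right]$$
$$\phantom{\Delta_1^{\mu\nu}} = i\int\frac{d^4k}{(2\pi)^4}\;{\rm tr}\left[\gamma^5\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{p}_1}\,\gamma^\mu\,\frac{1}{\slashed{k}} - \gamma^5\gamma^\nu\,\frac{1}{\slashed{k}-\slashed{q}}\,\gamma^\mu\,\frac{1}{\slashed{k}+\slashed{p}_2}\right]$$

and

$$\Delta_2^{\mu\nu} = i\int\frac{d^4k}{(2\pi)^4}\;{\rm tr}\left[\gamma^5\,\frac{1}{\slashed{k}-\slashed{q}}\,\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{p}_1}\,\gamma^\mu + \frac{1}{\slashed{k}}\,\gamma^5\gamma^\mu\,\frac{1}{\slashed{k}+\slashed{p}_2}\,\gamma^\nu\right]$$
$$\phantom{\Delta_2^{\mu\nu}} = i\int\frac{d^4k}{(2\pi)^4}\;{\rm tr}\left[-\gamma^5\gamma^\mu\,\frac{1}{\slashed{k}-\slashed{q}}\,\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{p}_1} + \gamma^5\gamma^\mu\,\frac{1}{\slashed{k}+\slashed{p}_2}\,\gamma^\nu\,\frac{1}{\slashed{k}}\right]$$

where in each case we go to the second line by using the cyclicity of the trace and the fact that $\{\gamma^\mu,\gamma^5\} = 0$. The advantage of collecting the terms in this way is that it naively looks as if both $\Delta_1^{\mu\nu}$ and $\Delta_2^{\mu\nu}$ vanish. For example, in $\Delta_1^{\mu\nu}$, all we need to do is shift the integration variable in the first term from $k$ to $k + p_2$. Using momentum conservation $p_1 + p_2 = -q$, we see that the two terms then cancel. Something similar happens for $\Delta_2^{\mu\nu}$. Taken at face value, it looks like we’ve succeeded in showing the Ward identity (3.38). Right? Well, no.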


The problem with this argument is that all the integrals above are divergent. Indeed, all the terms in $\Delta_1$ and $\Delta_2$ have two powers of $k$ in the numerator and four in the denominator, yet we integrate over $d^4k$, suggesting that they diverge quadratically. In fact, as we’ll see below, the gamma-matrix structure means that the divergence is actually linear. When dealing with such objects we need to be more careful.


There are a number of ways to deal with these differences of divergent integrals. Here we’ll pick a particular path. Consider the general integral of the form

$$\Delta = i\int\frac{d^4k}{(2\pi)^4}\,\left[f(k) - f(k+a)\right] \qquad (3.40)$$

where $f(k)$ is such that each individual integral $\int d^4k\, f(k)$ is linearly divergent. If we Taylor expand for small $a$, we have

$$\Delta = -i\int\frac{d^4k}{(2\pi)^4}\,\left[a^\mu\partial_\mu f(k) + \frac{1}{2}a^\mu a^\nu\partial_\mu\partial_\nu f(k) + \ldots\right]$$

with $\partial_\mu = \partial/\partial k^\mu$. Each term above is a boundary term. Moreover, each term in the expansion is less and less divergent. If the original integral is only linearly divergent we need keep only the first of these terms. We have

$$\Delta = -i\,a^\mu\int_{S^3}\frac{d^3\hat{k}}{(2\pi)^4}\;\hat{k}_\mu\,|k|^3 f(k) \qquad (3.41)$$

where the integral is taken over the boundary $S^3$ at $|k|\to\infty$, and $\hat{k}_\mu = k_\mu/|k|$ is the unit vector normal to it. We’ll now look at what this surface integral gives us for our triangle diagram.
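The claim that shifting the integration variable in a divergent integral leaves behind a boundary term is easiest to see in one dimension. The sketch below is a toy analogue, not the four-dimensional calculation: the function `f`, the shift `a` and the cutoff are all illustrative choices. Since `f` tends to $\pm 1$ at $\pm\infty$, the integral of `f` alone grows linearly with the cutoff, yet the difference $\int dx\,[f(x) - f(x+a)]$ converges to the surface term $-a\,[f(+\infty) - f(-\infty)]$ rather than to zero.

```python
import math


def f(x):
    # Tends to +1 as x -> +inf and to -1 as x -> -inf, so the integral of f
    # on its own grows linearly with the cutoff.
    return x / math.sqrt(1.0 + x * x)


def shifted_difference(a, cutoff, steps=200000):
    """Midpoint-rule estimate of the integral of f(x) - f(x + a) over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps):
        x = -cutoff + (i + 0.5) * h
        total += f(x) - f(x + a)
    return total * h


a = 0.7
numerical = shifted_difference(a, cutoff=2000.0)
boundary_term = -a * (1.0 - (-1.0))  # -a * [f(+inf) - f(-inf)]
print(numerical, boundary_term)
```

A naive shift of variables would set the integral to zero; the numerical value instead agrees with the boundary term, up to corrections that fall off as the cutoff is raised.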
An Ambiguity in the Integrals 


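To proceed, let’s first go back to the beginning and allow a general offset, $\beta^\mu$, between the momenta that run in the two loops. We then replace (3.39) with

$-i\Gamma^{\mu\nu\rho}(p_1,p_2,q)$ = [Figure: the same two triangle diagrams as in (3.39), except that every internal momentum in the second diagram is shifted by $\beta$, so that its fermion lines carry momenta $k+\beta$, $k+p_2+\beta$ and $k-q+\beta$.]

We will first find that the final answer depends on this arbitrary parameter $\beta$. We will then see how to resolve the ambiguity.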


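Following our manipulations above, we write this as

$$-iq_\rho\Gamma^{\mu\nu\rho}(p_1,p_2,q) = \Delta_1^{\mu\nu} + \Delta_2^{\mu\nu} \qquad (3.42)$$

where

$$\Delta_1^{\mu\nu} = i\int\frac{d^4k}{(2\pi)^4}\;{\rm tr}\left[\gamma^5\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{p}_1}\,\gamma^\mu\,\frac{1}{\slashed{k}} - \gamma^5\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{\beta}-\slashed{q}}\,\gamma^\mu\,\frac{1}{\slashed{k}+\slashed{\beta}+\slashed{p}_2}\right] \qquad (3.43)$$

$$\Delta_2^{\mu\nu} = i\int\frac{d^4k}{(2\pi)^4}\;{\rm tr}\left[\gamma^5\gamma^\mu\,\frac{1}{\slashed{k}+\slashed{\beta}+\slashed{p}_2}\,\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{\beta}} - \gamma^5\gamma^\mu\,\frac{1}{\slashed{k}-\slashed{q}}\,\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{p}_1}\right] \qquad (3.44)$$

Each of these is of the form (3.40). For $\Delta_1^{\mu\nu}$, we have a difference of two divergent integrals, with integrand

$$f(k) = {\rm tr}\left[\gamma^5\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{p}_1}\,\gamma^\mu\,\frac{1}{\slashed{k}}\right] = \frac{1}{k^2(k+p_1)^2}\;{\rm tr}\left[\gamma^5\gamma^\nu(\slashed{k}+\slashed{p}_1)\gamma^\mu\slashed{k}\right]$$

We now use the gamma matrix identity

$${\rm tr}\left(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma^5\right) = 4i\epsilon^{\mu\nu\rho\sigma}$$

to write

$$f(k) = -4i\epsilon^{\mu\nu\rho\sigma}\,\frac{k_\rho(k+p_1)_\sigma}{k^2(k+p_1)^2} = 4i\epsilon^{\mu\nu\rho\sigma}\,\frac{p_{1\rho}k_\sigma}{k^2(k+p_1)^2}$$

In the second equality, we’ve used the anti-symmetry of the epsilon tensor to remove the $k_\rho k_\sigma$ term. This is why, as advertised above, our integrals are actually linearly divergent rather than quadratically divergent. We can now simply apply the result (3.41) to the cases of interest. For the integral $\Delta_1^{\mu\nu}$, the off-set is given by $a = \beta + p_2$, and we have

$$\Delta_1^{\mu\nu} = 4\int_{S^3}\frac{d^3\hat{k}}{(2\pi)^4}\;\epsilon^{\mu\nu\rho\sigma}\,(\beta+p_2)^\lambda\hat{k}_\lambda\,\frac{|k|^3\,p_{1\rho}k_\sigma}{k^2(k+p_1)^2}$$

To perform the integration over $S^3$, we use

$$\int_{S^3}d^3\hat{k}\;\hat{k}^\mu\hat{k}^\nu = \frac{1}{4}\,\delta^{\mu\nu}\,{\rm Vol}(S^3)$$

with ${\rm Vol}(S^3) = 2\pi^2$. We find

$$\Delta_1^{\mu\nu} = \frac{1}{8\pi^2}\,\epsilon^{\mu\nu\rho\sigma}\,p_{1\rho}(\beta+p_2)_\sigma$$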
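The angular average $\int_{S^3} d^3\hat{k}\;\hat{k}^\mu\hat{k}^\nu = \frac{1}{4}\delta^{\mu\nu}\,{\rm Vol}(S^3)$ with ${\rm Vol}(S^3) = 2\pi^2$ can be checked by direct quadrature over the unit three-sphere. This is a rough sketch: the hyperspherical parameterisation and the grid size `n` are choices made here, and the midpoint rule is only accurate to a few parts in a thousand at this resolution.

```python
import math


def s3_moments(n=40):
    """Midpoint-rule estimates of Vol(S^3) and of the integral of k_i k_j over the unit S^3."""
    d_chi = math.pi / n
    d_theta = math.pi / n
    d_phi = 2.0 * math.pi / n
    volume = 0.0
    moments = [[0.0] * 4 for _ in range(4)]
    for a in range(n):
        chi = (a + 0.5) * d_chi
        for b in range(n):
            theta = (b + 0.5) * d_theta
            # Measure on S^3 in hyperspherical angles: sin^2(chi) sin(theta).
            weight = math.sin(chi) ** 2 * math.sin(theta) * d_chi * d_theta * d_phi
            for c in range(n):
                phi = (c + 0.5) * d_phi
                k = (
                    math.cos(chi),
                    math.sin(chi) * math.cos(theta),
                    math.sin(chi) * math.sin(theta) * math.cos(phi),
                    math.sin(chi) * math.sin(theta) * math.sin(phi),
                )
                volume += weight
                for i in range(4):
                    for j in range(4):
                        moments[i][j] += weight * k[i] * k[j]
    return volume, moments


volume, moments = s3_moments()
print(volume, 2.0 * math.pi ** 2)
print(moments[0][0], moments[1][1], moments[0][1])
```

The diagonal entries should each come out close to $\pi^2/2$ and the off-diagonal entries close to zero, which is the content of the formula used above.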


We can go through the same steps to evaluate $\Delta_2^{\mu\nu}$ in (3.44). This time we have the off-set $a = p_1 - \beta$ and find

$$\Delta_2^{\mu\nu} = \frac{1}{8\pi^2}\,\epsilon^{\nu\mu\rho\sigma}\,p_{2\rho}(p_1-\beta)_\sigma$$

The Ward identity for the axial symmetry (3.42) then becomes

$$-iq_\rho\Gamma^{\mu\nu\rho} = \frac{1}{8\pi^2}\,\epsilon^{\mu\nu\rho\sigma}\left[2p_{1\rho}p_{2\sigma} + (p_1+p_2)_\rho\,\beta_\sigma\right] \qquad (3.45)$$

As we suspected, this depends on our arbitrary 4-momentum $\beta$. The question is: how do we fix $\beta$?

Resolving the Ambiguity


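The answer comes by looking at the Ward identity (3.37) for the vector symmetry. It turns out that this too depends on $\beta$. Indeed, we have

$$-ip_{1\mu}\Gamma^{\mu\nu\rho} = i\int\frac{d^4k}{(2\pi)^4}\;{\rm tr}\left[\frac{1}{\slashed{k}}\,\gamma^\rho\gamma^5\,\frac{1}{\slashed{k}-\slashed{q}}\,\gamma^\nu\,\frac{1}{\slashed{k}+\slashed{p}_1}\,\slashed{p}_1 + \frac{1}{\slashed{k}+\slashed{\beta}}\,\gamma^\rho\gamma^5\,\frac{1}{\slashed{k}+\slashed{\beta}-\slashed{q}}\,\slashed{p}_1\,\frac{1}{\slashed{k}+\slashed{\beta}+\slashed{p}_2}\,\gamma^\nu\right]$$

Playing the same kind of games that we saw above, we have an anomalous Ward identity for the vector current

$$-ip_{1\mu}\Gamma^{\mu\nu\rho} = \frac{1}{8\pi^2}\,\epsilon^{\nu\rho\sigma\lambda}\,p_{1\sigma}(\beta-p_2)_\lambda$$

Similarly, the other vector Ward identity reads

$$-ip_{2\nu}\Gamma^{\mu\nu\rho} = \frac{1}{8\pi^2}\,\epsilon^{\mu\rho\sigma\lambda}\,p_{2\sigma}(\beta+p_1)_\lambda$$

We learn that all three Ward identities depend on the arbitrary 4-momentum $\beta$. This provides the clue that we need in order to determine $\beta$. Suppose that we wish to insist that the vector current survives quantisation. Indeed, this must be the case if we wish to couple this to a background gauge field. In this case, we must choose a $\beta$ such that the two vector Ward identities are non-anomalous. For this, we must have

$$\beta - p_2 \sim p_1 \quad\text{and}\quad \beta + p_1 \sim p_2 \quad\Rightarrow\quad \beta = p_2 - p_1$$

With this choice

$$-ip_{1\mu}\Gamma^{\mu\nu\rho} = -ip_{2\nu}\Gamma^{\mu\nu\rho} = 0$$

while the axial Ward identity (3.45) becomes

$$-iq_\rho\Gamma^{\mu\nu\rho} = \frac{1}{2\pi^2}\,\epsilon^{\mu\nu\rho\sigma}\,p_{1\rho}p_{2\sigma} \qquad (3.46)$$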

This is the anomaly for the free fermion. 
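The algebra that fixes $\beta$ can be checked mechanically. The sketch below takes as input the three $\beta$-dependent right-hand sides in the form $\frac{1}{8\pi^2}\epsilon^{\cdots}[\ldots]$, namely $2p_1 p_2 + (p_1+p_2)\beta$ for the axial identity and $p_1(\beta - p_2)$, $p_2(\beta + p_1)$ for the two vector identities. It then confirms that $\beta = p_2 - p_1$ kills both vector anomalies and leaves $\frac{1}{2\pi^2}\epsilon^{\mu\nu\rho\sigma}p_{1\rho}p_{2\sigma}$ in the axial one. The sample momenta and the normalisation $\epsilon^{0123} = +1$ are assumptions made for illustration.

```python
import math
from itertools import permutations


def levi_civita_4():
    """Map each permutation of (0, 1, 2, 3) to its sign, with eps[0123] = +1 (a convention choice)."""
    signs = {}
    for perm in permutations(range(4)):
        inversions = sum(
            1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j]
        )
        signs[perm] = -1 if inversions % 2 else 1
    return signs


EPS = levi_civita_4()


def eps_contract(u, v):
    """Return T[a][b] = sum over c, d of eps[a b c d] u[c] v[d]."""
    t = [[0.0] * 4 for _ in range(4)]
    for (a, b, c, d), sign in EPS.items():
        t[a][b] += sign * u[c] * v[d]
    return t


def scaled_sum(*terms):
    """Add up 4x4 tables given as (coefficient, table) pairs."""
    return [
        [sum(coeff * table[a][b] for coeff, table in terms) for b in range(4)]
        for a in range(4)
    ]


PREFACTOR = 1.0 / (8.0 * math.pi ** 2)


def ward_identities(p1, p2, beta):
    """Right-hand sides of the axial and the two vector Ward identities as quoted in the text."""
    p_sum = [x + y for x, y in zip(p1, p2)]
    beta_minus_p2 = [b - y for b, y in zip(beta, p2)]
    beta_plus_p1 = [b + x for b, x in zip(beta, p1)]
    axial = scaled_sum(
        (2.0 * PREFACTOR, eps_contract(p1, p2)),
        (PREFACTOR, eps_contract(p_sum, beta)),
    )
    vector_1 = scaled_sum((PREFACTOR, eps_contract(p1, beta_minus_p2)))
    vector_2 = scaled_sum((PREFACTOR, eps_contract(p2, beta_plus_p1)))
    return axial, vector_1, vector_2


# Arbitrary sample momenta (hypothetical values).
p1 = [1.0, 0.5, -0.3, 2.0]
p2 = [0.2, -1.1, 0.7, 0.4]
beta = [y - x for x, y in zip(p1, p2)]  # beta = p2 - p1

axial, vector_1, vector_2 = ward_identities(p1, p2, beta)
expected_axial = scaled_sum((1.0 / (2.0 * math.pi ** 2), eps_contract(p1, p2)))
```

Calling `ward_identities` with a different `beta`, for example the zero vector, shows the anomaly being shared between the vector and axial currents instead.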


Our discussion above looks rather different from the path integral approach of Section 3.2.2. We see that we have an arbitrary parameter $\beta$ which allows us to shift the anomaly between the axial and vector currents. Why did we miss this before? The reason is that we chose a specific regulator, first introduced in (3.28), which was gauge invariant. By construction, this ensures that the vector symmetry is preserved at the expense of the axial symmetry.

More generally, different regulators will violate some linear combination of the symmetries. Usually, it is the axial symmetry which suffers. For example, if we use Pauli-Villars, we need to introduce a massive fermion and the mass term explicitly breaks the axial symmetry.


Including Gauge Fields 
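So far, the anomaly in momentum space (3.46) looks rather different from our original version (3.34)

$$\partial_\mu j^\mu_A = \frac{e^2}{16\pi^2}\,\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma} \qquad (3.47)$$

However, they are actually the same formula in disguise. To see this, we couple the vector current $j^\mu = \bar\psi\gamma^\mu\psi$ to a $U(1)$ gauge field $A_\mu$, so the fermions are now described by the action

$$S = \int d^4x\; i\bar\psi\gamma^\mu(\partial_\mu - ieA_\mu)\psi \qquad (3.48)$$

For the purposes of our discussion, $A_\mu$ could be either a fixed, background field or, alternatively, a dynamical gauge field. From our previous definitions we have

$$-iq_\rho\Gamma^{\mu\nu\rho} = \int d^4x_1\,d^4x_2\,d^4x_3\;\langle 0|\,T\left(j^\mu j^\nu\,\partial_\rho j^\rho_A\right)|0\rangle\;e^{i(p_1\cdot x_1+p_2\cdot x_2+q\cdot x_3)}$$

where we’ve omitted the delta-function $\delta^4(p_1+p_2+q)$ from the left-hand-side, as well as various arguments. Using the chiral anomaly in the form (3.47), we can write

$$\langle 0|\,T\left(j^\mu j^\nu\,\partial_\rho j^\rho_A\right)|0\rangle = \frac{e^2}{16\pi^2}\,\langle 0|\,T\left(j^\mu j^\nu\,\epsilon^{\alpha\beta\gamma\delta}F_{\alpha\beta}F_{\gamma\delta}\right)|0\rangle$$
$$\phantom{\langle 0|\,T\left(j^\mu j^\nu\,\partial_\rho j^\rho_A\right)|0\rangle} = \frac{e^2}{4\pi^2}\,\epsilon^{\alpha\beta\gamma\delta}\,\langle 0|\,j^\mu\,\partial_\alpha A_\beta|0\rangle\,\langle 0|\,j^\nu\,\partial_\gamma A_\delta|0\rangle + \text{permutation}$$

But the two-point function of the current and gauge field can be read off from the Feynman rules for the action (3.48)

$$e\,\langle 0|\,j^\mu(x_1)A_\sigma(x_3)|0\rangle = -i\,\delta^\mu_\sigma\,\delta^4(x_1-x_3)$$

A little algebra then allows us to reproduce the anomaly in momentum space,

$$-iq_\rho\Gamma^{\mu\nu\rho} = \frac{1}{2\pi^2}\,\epsilon^{\mu\nu\rho\sigma}\,p_{1\rho}p_{2\sigma}$$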


As we mentioned in Section 3.2.2, when the gauge fields are dynamical one might worry
about higher order corrections to the anomaly. It turns out that these don’t arise. This
was first proven by Adler and Bardeen by explicit analysis of the higher-loop Feynman
diagrams. We will give a more modern, topological viewpoint on this in Section 3.3.1.

3.2.4 Chiral Anomalies and Gravity


There is a second, related contribution to the axial anomaly. This doesn’t arise when 
the theory is coupled to background electric fields, but instead when the theory is 
coupled to curved spacetime. As before, this effect arises either for quantum field 
theory in a fixed, background spacetime, or for quantum field theory coupled to gravity 
which, of course, means dynamical spacetime. 


Let’s first review how to couple spinors to a curved spacetime. The starting point is to decompose the metric in terms of vierbeins,

$$g_{\mu\nu}(x) = e^a_\mu(x)\,e^b_\nu(x)\,\eta_{ab}$$

There is an arbitrariness in our choice of vierbein, and this arbitrariness introduces an $SO(3,1)$ gauge symmetry into the game. The associated gauge field $\omega_\mu{}^a{}_b$ is called the spin connection. It is determined by the requirement that the vierbeins are covariantly constant

$$D_\mu e^a_\nu \equiv \partial_\mu e^a_\nu - \Gamma^\rho_{\mu\nu}\,e^a_\rho + \omega_\mu{}^a{}_b\,e^b_\nu = 0$$

where $\Gamma^\rho_{\mu\nu}$ are the usual Christoffel symbols. This language makes general relativity look very much like any other gauge theory. In particular, the field strength of the spin connection is

$$(R_{\mu\nu})^a{}_b = \partial_\mu\omega_\nu{}^a{}_b - \partial_\nu\omega_\mu{}^a{}_b + [\omega_\mu,\omega_\nu]^a{}_b$$

and is related to the usual Riemann tensor by $(R_{\mu\nu})^a{}_b = e^a_\rho\,e^\sigma_b\,R_{\mu\nu}{}^\rho{}_\sigma$.

This machinery is just what we need to couple a Dirac spinor to a background curved spacetime. The appropriate covariant derivative is

$$D_\mu\psi_\alpha = \partial_\mu\psi_\alpha + \frac{1}{2}\,\omega_\mu^{ab}\,(S_{ab})_\alpha{}^\beta\,\psi_\beta$$

where $S_{ab} = \frac{1}{4}[\gamma_a,\gamma_b]$ is the generator of the Lorentz group in the spinor representation.


Written in this way, the coupling of spinors to a curved spacetime looks very similar to the coupling to electromagnetic fields. It is not surprising, therefore, that there is a gravitational contribution to the anomaly. The kind of manipulations we performed previously now give

$$D_\mu j^\mu_A = -\frac{1}{384\pi^2}\,\epsilon^{\mu\nu\rho\sigma}R_{\mu\nu\alpha\beta}R_{\rho\sigma}{}^{\alpha\beta} \qquad (3.49)$$

3.3 Fermi Zero Modes


The anomaly was first discovered in the late 1960s in an attempt to make sense of the
observed decay rate of the neutral pion to a pair of photons. We will tell this story in 
Section 5.4.3 where we describe some aspects of the spectrum of QCD. 


Here, instead, we focus on ways in which the anomaly fits into our general under- 
standing of fermions coupled to gauge fields. 


3.3.1 The Atiyah-Singer Index Theorem 


The anomaly has a rather nice mathematical interpretation: it is a manifestation of 
the famous Atiyah-Singer index theorem. 


Consider again the Dirac operator in Euclidean space in the background of a general gauge field $A_\mu$. The operator $i\slashed{D}$ is Hermitian and so has real eigenvalues,

$$i\slashed{D}\phi_n = \lambda_n\phi_n$$

with $\lambda_n \in \mathbb{R}$. Whenever we have an eigenfunction $\phi_n$ with $\lambda_n \neq 0$ then $\gamma^5\phi_n$ is also an eigenfunction. This follows because $\gamma^\mu\gamma^5 = -\gamma^5\gamma^\mu$ for $\mu = 1,2,3,4$ so

$$i\slashed{D}\,(\gamma^5\phi_n) = -\gamma^5\,i\slashed{D}\phi_n = -\lambda_n\gamma^5\phi_n \qquad (3.50)$$

We see that all non-zero eigenvalues come in $\pm\lambda_n$ pairs. Moreover, $\phi_n$ and $\gamma^5\phi_n$ must be orthogonal functions, since they are eigenfunctions of a Hermitian operator with different eigenvalues. Evidently, the eigenfunctions with $\lambda_n \neq 0$ cannot also be eigenfunctions of $\gamma^5$.

However, the zero eigenvalues are special because the argument above no longer works. The corresponding eigenfunctions are called zero modes. Now, it may well be that $\phi_n$ and $\gamma^5\phi_n$ are actually the same functions. More generally, for the zero modes we can simultaneously diagonalise $i\slashed{D}$ and $\gamma^5$ (because both $\phi_n$ and $\gamma^5\phi_n$ have the same $i\slashed{D}$ eigenvalue, namely zero). Since $(\gamma^5)^2 = 1$, the possible eigenvalues of $\gamma^5$ are $\pm 1$. We then define $n_+$ and $n_-$ to be the number of zero modes of $i\slashed{D}$ with $\gamma^5$ eigenvalue $+1$ and $-1$ respectively. The total number of zero modes is obviously $n_+ + n_-$. The index of the Dirac operator is defined to be

$${\rm Index}(i\slashed{D}) = n_+ - n_-$$
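A finite-dimensional toy model makes the robustness of the index concrete. This is an analogy chosen here for illustration, not the Dirac operator itself. Take a chirality matrix with $p$ eigenvalues $+1$ and $q$ eigenvalues $-1$. Any Hermitian matrix that anticommutes with it has the off-diagonal form $D = \begin{pmatrix} 0 & M \\ M^T & 0\end{pmatrix}$ with $M$ a real $p \times q$ block. Its zero modes of positive chirality span the kernel of $M^T$ and those of negative chirality span the kernel of $M$, so $n_+ - n_- = p - q$ whatever $M$ is. The sketch below counts zero modes using exact rational arithmetic; the two example matrices are hypothetical.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of an integer matrix by exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def zero_mode_counts(m):
    """Return (n_plus, n_minus) for the toy operator D = [[0, m], [m^T, 0]].

    n_plus counts solutions of m^T u = 0 and n_minus counts solutions of m v = 0.
    """
    p, q = len(m), len(m[0])
    r = rank(m)
    return p - r, q - r


generic = [[1, 2], [3, 4], [5, 6]]     # rank 2
degenerate = [[1, 2], [2, 4], [3, 6]]  # rank 1: extra zero modes appear in a +/- pair

for m in (generic, degenerate):
    n_plus, n_minus = zero_mode_counts(m)
    print(n_plus, n_minus, n_plus - n_minus)
```

Deforming `generic` into `degenerate` changes the total number of zero modes, but only by adding one of each chirality, so the difference is unchanged. This mirrors the pairing of non-zero eigenvalues described above.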


But we have actually computed this index as part of our derivation of the anomaly above! To see this, consider again the result (3.32)

$$\sum_n \bar\phi_n\gamma^5\phi_n = \frac{e^2}{32\pi^2}\,\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}$$

This is rather formal since, in $\mathbb{R}^4$, there will be a continuum of eigenvalues labelled by the index $n$. However, we can always compactify the theory on your favourite four-manifold and the spectrum will become discrete. We then integrate this equation over the manifold,

$$\int d^4x\,\sum_n \bar\phi_n\gamma^5\phi_n = \frac{e^2}{32\pi^2}\int d^4x\;\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}$$

and note that only the zero modes contribute to the left-hand side. This is because, as we saw above, whenever $\lambda_n \neq 0$ then $\phi_n$ and $\gamma^5\phi_n$ are orthogonal functions. This means that the left-hand-side is the index that we want to compute

$$\int d^4x\,\sum_n \bar\phi_n\gamma^5\phi_n = \int d^4x \sum_{\text{zero modes}} \bar\phi_n\gamma^5\phi_n = n_+ - n_-$$

We get our final result

$${\rm Index}(i\slashed{D}) = \frac{e^2}{32\pi^2}\int d^4x\;\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}$$

This is the Atiyah-Singer index theorem. Mathematicians usually state this in units 
where e = 1. Note that the right-hand side is exactly the quantity that we showed to 
be an integer in Section 1.2.4 when considering the theta angle in Maxwell theory. 


The connection to the index theorem is our first hint that there is something deep 
about the anomaly. To illustrate this in physical terms, consider our theory on the space 
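$\mathbb{R} \times X$, where $X$ is a closed spatial 3-manifold. We define the axial charge $Q_A = \int_X j^0_A$. We also parameterise $\mathbb{R}$ by $t$ (think “time” even though we’re in Euclidean space). Then the integrated anomaly equation tells us the change in the charge,

$$\Delta Q_A = Q_A\big|_{t=+\infty} - Q_A\big|_{t=-\infty} = \frac{e^2}{16\pi^2}\int d^4x\;\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma} \qquad (3.51)$$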
The left-hand side is an integer because of quantum mechanics. Meanwhile, the right- 
hand side is an integer because of topology. The anomaly equation relates these two 
ideas. 


This connection to topology also explains why the anomaly equation (3.34) (or, for 
non-Abelian gauge theories, (3.35)) is exact, and does not get corrected at higher order 
in perturbation theory. It is simply because the right-hand side of (3.51) is an integer 
and any corrections (say, at order $e^4$) would change this.


3.3.2 Instantons Revisited 


The anomaly tells us that, in spite of classical appearances, $U(1)_A$ is not really a
symmetry of our theory. This, in turn, means that the axial charge is not conserved.
The result (3.51) tells us that we expect to see violation of this charge when $\int d^4x\; F\,{}^\star\!F$
is non-zero. This tallies with the picture we built up in Section 3.1.2, where we needed 
to turn on constant background electric and magnetic fields to see that the axial charge 
is not conserved. 


At this point, there is an important difference between Abelian and non-Abelian 
theories. This arises because non-Abelian theories have finite action configurations with 
$\int d^4x\; F\,{}^\star\!F \neq 0$. Among these are the classical instanton solutions that we described
in Section 2.3. This means that the path integral about the vacuum state will include 
configurations which give rise to the violation of axial charge. 


In contrast, Abelian theories have no finite action configurations which change the 
axial charge; such a process will not happen dynamically about the vacuum, but must 
be induced by turning on background fields as in Section 3.1.2. (This is true at least 
on $\mathbb{R}^4$; the situation changes on compact manifolds and the Abelian theories are closer
in spirit to their non-Abelian counterparts.) 


It’s worth understanding in more detail how instantons can give rise to violation of 
axial charge. Let’s start by revisiting the calculation of Section 2.3, where we showed 
that instantons provide a semi-classical mechanism to tunnel between the $|n\rangle$ vacua of Yang-Mills. The end result of that calculation was that the true physical ground states of Yang-Mills are given by the theta vacua (2.43)

$$|\theta\rangle = \sum_n e^{i\theta n}\,|n\rangle$$


Now what happens if we have a massless fermion in the game? As we’ve seen above, in the background of an instanton a massless quark will have a zero mode. Performing the path integral over the fermion fields then gives the amplitude for tunnelling between two $|n\rangle$ ground states. Schematically, we have

$$\langle n|n+\nu\rangle \sim \int \mathcal{D}A\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\;\exp\left(-\int d^4x\,\left[\frac{1}{2g^2}\,{\rm tr}\,F^{\mu\nu}F_{\mu\nu} + i\bar\psi\slashed{D}\psi\right]\right)$$
$$\phantom{\langle n|n+\nu\rangle} \sim \int \mathcal{D}A\;\det(i\slashed{D})\,\exp\left(-\int d^4x\;\frac{1}{2g^2}\,{\rm tr}\,F^{\mu\nu}F_{\mu\nu}\right)$$

Previously, this amplitude received a non-vanishing contribution from instantons with winding number $\nu$. Now, however, the fermion has a zero mode in any such configuration. This means that $\det(i\slashed{D}) = 0$. We see that the presence of a massless fermion suppresses the vacuum tunnelling of Section 2.3.


While instantons no longer give rise to vacuum tunnelling, they do still have a role to play for, as we anticipated above, they now violate axial charge. To see how this happens, let’s tease apart the calculation above. Following (3.25), we expand our fermion fields in terms of eigenspinors $\phi_n$ and $\bar\phi_n$,

$$\psi(x) = \sum_n a_n\phi_n(x) \quad\text{and}\quad \bar\psi(x) = \sum_n \bar{b}_n\bar\phi_n(x)$$

where $a_n$ and $\bar{b}_n$ are Grassmann-valued numbers and the eigenspinors obey

$$i\slashed{D}\phi_n = \lambda_n\phi_n$$

The action for the fermions is

$$S = \int d^4x\; i\bar\psi\slashed{D}\psi = \sum_n \lambda_n\bar{b}_n a_n$$

A fermion zero mode is an eigenspinor, which we will denote as $\phi_0$, with $\lambda_0 = 0$. This means that the corresponding Grassmann parameters $a_0$ and $\bar{b}_0$ do not appear in the action. When we compute the fermionic path integral, we have

$$\int \mathcal{D}\psi\,\mathcal{D}\bar\psi\;\exp\left(-\int d^4x\; i\bar\psi\slashed{D}\psi\right) = \prod_n\int da_n\,d\bar{b}_n\;\exp\left(-\lambda_n\bar{b}_n a_n\right)$$
$$\phantom{\int \mathcal{D}\psi\,\mathcal{D}\bar\psi\;\exp\left(-\int d^4x\; i\bar\psi\slashed{D}\psi\right)} = \prod_n\int da_n\,d\bar{b}_n\;\prod_m\left(1 + \lambda_m\bar{b}_m a_m\right)$$

But Grassmann integrals are particularly easy: they’re either zero or one, with $\int da = 0$ and $\int da\; a = 1$. The integration above vanishes whenever there is a fermi zero mode because there’s nothing to soak up the integration over the associated Grassmann variables $a_0$ and $\bar{b}_0$. This is why massless fermions cause the instanton tunnelling amplitude to vanish.
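The way a zero mode kills the Grassmann integral, and the way an insertion of $\bar{b}_0 a_0$ rescues it, can be seen in a small explicit Grassmann algebra. The implementation below is a sketch written for this purpose. The dictionary representation, the function names and the eigenvalues are illustrative choices, and the Berezin integral is normalised so that the ordered top monomial $\bar{b}_0 a_0 \bar{b}_1 a_1 \ldots$ integrates to one.

```python
def gmul(x, y):
    """Product of two Grassmann-algebra elements stored as {sorted index tuple: coefficient}."""
    out = {}
    for mono_x, coeff_x in x.items():
        for mono_y, coeff_y in y.items():
            if set(mono_x) & set(mono_y):
                continue  # a repeated generator squares to zero
            merged = mono_x + mono_y
            # Sign from reordering the generators into increasing order.
            inversions = sum(
                1
                for i in range(len(merged))
                for j in range(i + 1, len(merged))
                if merged[i] > merged[j]
            )
            key = tuple(sorted(merged))
            sign = -1 if inversions % 2 else 1
            out[key] = out.get(key, 0.0) + sign * coeff_x * coeff_y
    return out


def berezin(x, n_generators):
    """Integrate over every generator: pick out the coefficient of the top monomial."""
    return x.get(tuple(range(n_generators)), 0.0)


def mode_factor(n, lam):
    """The factor 1 + lam * bbar_n * a_n, with bbar_n -> generator 2n and a_n -> generator 2n + 1."""
    return {(): 1.0, (2 * n, 2 * n + 1): lam}


def fermion_integral(eigenvalues, insert_mode=None):
    """Berezin integral of prod_n (1 + lam_n bbar_n a_n), optionally times bbar_m a_m."""
    integrand = {(): 1.0}
    for n, lam in enumerate(eigenvalues):
        integrand = gmul(integrand, mode_factor(n, lam))
    if insert_mode is not None:
        m = insert_mode
        integrand = gmul(integrand, {(2 * m, 2 * m + 1): 1.0})
    return berezin(integrand, 2 * len(eigenvalues))


no_zero_mode = fermion_integral([2.0, 3.0, 5.0])         # product of all eigenvalues
with_zero_mode = fermion_integral([0.0, 3.0, 5.0])       # vanishes: nothing soaks up mode 0
zero_mode_soaked_up = fermion_integral([0.0, 3.0, 5.0], insert_mode=0)  # product of the rest
print(no_zero_mode, with_zero_mode, zero_mode_soaked_up)
```

The third number is the finite-dimensional analogue of a determinant with the zero mode omitted: inserting $\bar{b}_0 a_0$ saturates the integral over the zero-mode variables and leaves the product of the remaining eigenvalues.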


We learn that we’re only going to get a non-vanishing answer from instantons if we compute a correlation function that includes the fermion zero mode. This leads to a rather pretty superselection rule. Consider the correlation function

$$\langle\bar\psi_-\psi_+\rangle$$

This is known as a chiral condensate. It has axial charge $+2$. If $U(1)_A$ were a good, unbroken symmetry of our theory then we would expect this to vanish in the vacuum. However, we know that $U(1)_A$ is, instead, anomalous. We will now see that this is reflected in a non-vanishing expectation value for the chiral condensate.

Written in terms of our eigenbasis, the chiral condensate becomes

$$\bar\psi_-\psi_+ = \frac{1}{2}\sum_{l,l'}\bar{b}_l a_{l'}\;\bar\phi_l(1+\gamma^5)\phi_{l'}$$

where we’re using the fact that $\gamma^5\psi_\pm = \pm\psi_\pm$ to write $\psi_+$ as a projection of $\psi$ onto the $+1$ eigenvalue of $\gamma^5$. We can then write the correlation function as

$$\langle\bar\psi_-\psi_+\rangle = \prod_n\int da_n\,d\bar{b}_n\;\prod_m\left(1+\lambda_m\bar{b}_m a_m\right)\,\frac{1}{2}\sum_{l,l'}\bar{b}_l a_{l'}\;\bar\phi_l(1+\gamma^5)\phi_{l'}$$
$$\phantom{\langle\bar\psi_-\psi_+\rangle} = \det(i\slashed{D})\,\sum_n\frac{1}{\lambda_n}\,\bar\phi_n\left(\frac{1+\gamma^5}{2}\right)\phi_n \qquad (3.52)$$

We can look at the contributions to this from each instanton sector, $\nu$. When we’re in the trivial, $\nu = 0$, sector there are generically no zero modes so the product $\prod_n\lambda_n \neq 0$. (One might wonder whether perhaps $n_+ = n_- \neq 0$. This is possible, but generically will not be the case.) However, as we saw in (3.50), the eigenvalues $\lambda_n$ come in $\pm$ pairs, a fact which follows from the existence of $\gamma^5$. This means that the sum over $\lambda_n$ will contain equal and opposite contributions, and the contribution from the trivial instanton sector is $\langle\bar\psi_-\psi_+\rangle_{\nu=0} = 0$.


In contrast, interesting things happen when we have winding $\nu = 1$. Now there is a single zero mode which obeys $\gamma^5\phi_0 = +\phi_0$. But the multiplication by $\lambda_0$ in the product is precisely cancelled by the factor of $1/\lambda_0$ that accompanies the $\bar\phi_0\phi_0$ term in the sum. We see that, in this semi-classical approximation,

$$\langle\bar\psi_-\psi_+\rangle_{\nu=1} = {\det}'(i\slashed{D})\;\bar\phi_0\phi_0$$

where $\det'$ means that you multiply over all eigenvalues, but omit the zero modes.

In fact, this is the only topological sector that contributes to $\langle\bar\psi_-\psi_+\rangle$. When $\nu = -1$, we also have a zero mode but it has opposite chirality, $\gamma^5\phi_0 = -\phi_0$, and so does not contribute. Instead, this sector will contribute to $\langle\bar\psi_+\psi_-\rangle$.

Meanwhile, when $|\nu| \geq 2$, we have more than one zero mode and the integral (3.52) again vanishes. Instead, these sectors will contribute to correlators of the form $\langle(\bar\psi_-\psi_+)^\nu\rangle$.

3.3.3 The Theta Term Revisited


We saw above that the existence of massless fermions (and, in particular, their fermi zero modes) quashes the tunnelling between $|n\rangle$ vacua. This leaves us with a question: what becomes of the theta angle?

The answer to this is hiding within our path integral derivation of the anomaly. Consider a single Dirac fermion coupled to a gauge field (either Abelian or non-Abelian, it doesn’t matter) and make a chiral rotation (3.21). On left- and right-handed spinors, this acts as

$$\psi_+ \to e^{i\alpha}\psi_+ \quad\text{and}\quad \psi_- \to e^{-i\alpha}\psi_- \qquad (3.53)$$

The upshot of our long calculation in Section 3.2.2 is that the measure transforms as (3.33),

$$\int \mathcal{D}\psi\,\mathcal{D}\bar\psi \;\longrightarrow\; \int \mathcal{D}\psi\,\mathcal{D}\bar\psi\;\exp\left(-\frac{ie^2\alpha}{16\pi^2}\int d^4x\;\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}\right)$$

But this is something that we’ve seen before: it is the theta-term that we introduced for Maxwell theory in Section 1.2 and for Yang-Mills in Section 2.2! We see that a chiral rotation (3.53) effectively shifts the theta-angle by

$$\theta \to \theta - 2\alpha \qquad (3.54)$$


This means that the theta angle isn’t really physical: it can be absorbed by changing 
the phase of the fermion. 
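The factor of two in the shift can be made explicit by placing the Jacobian next to the theta term. The sketch below assumes that the Maxwell theta term is normalised as $S_\theta = \frac{\theta e^2}{32\pi^2}\int d^4x\,\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}$ and enters the path integral as $e^{iS_\theta}$. This normalisation is an assumption about the conventions of Section 1.2, chosen so that the quantity multiplying $\theta$ is the integer of the index theorem.

```latex
\begin{align}
e^{iS_\theta}\times\exp\left(-\frac{ie^2\alpha}{16\pi^2}\int d^4x\;
      \epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}\right)
 &= \exp\left(\frac{i\theta e^2}{32\pi^2}\int d^4x\;
      \epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}
      - \frac{2i\alpha\, e^2}{32\pi^2}\int d^4x\;
      \epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}\right) \\
 &= \exp\left(\frac{i(\theta-2\alpha)\,e^2}{32\pi^2}\int d^4x\;
      \epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}\right)
\end{align}
```

Choosing $\alpha = \theta/2$ therefore removes the theta term altogether when the fermion is massless, since nothing else in the action depends on the phase of $\psi_\pm$.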


(There is a caveat here: the mass for a single fermion might undergo additive renor- 
malisation that shifts it away from zero. So it’s not quite right to say that the theta 
angle ceases to exist when $m = 0$. Rather, we should say that for $m \in \mathbb{R}$, there is a single
value where the theta-angle becomes unphysical. Note that this issue doesn’t arise
if multiple fermions become massless because then we get an enhanced chiral symmetry 
which prohibits an additive mass renormalisation.) 


This ties in with our discussion of instantons in the previous section. We saw that the chiral condensate $\langle\bar\psi_-\psi_+\rangle$ receives a contribution only from topological sectors with winding $\nu = 1$. If we added a theta term in the action, we would find $\langle\bar\psi_-\psi_+\rangle \sim e^{i\theta}$, since $e^{i\theta}$ is the phase that weights a single instanton. This agrees with our result (3.54).


The discussion above shows that the parameter $\theta$ can be absorbed into a dynamical
field, which is the phase of the fermion. But we can also turn this idea on its head.
Suppose that we hadn’t realised that $U(1)_A$ was anomalous, but we knew that $\langle\bar\psi_-\psi_+\rangle \neq 0$.
We might be tempted to conclude that this condensate has broken a global symmetry
and would be entitled to expect the existence of an associated Goldstone boson, which
is the phase of the condensate. Yet no such Goldstone boson exists. One can view
the would-be Goldstone boson as $\theta$, but it is a parameter of the theory, rather than a
dynamical field!

With more than one massless fermion, there are also fermionic condensates that
break the non-anomalous part of the chiral flavour symmetry. These are not due to 
instantons and, this time, we do get Goldstone modes. Their story is interesting enough 
that it gets its own chapter: it will be told in Section 5. 


So far we have focussed on massless fermions. What happens for a massive fermion? Does the $\theta$ angle suddenly become active again? Well, sort of. For a Dirac fermion, we have two choices of mass term: either $\bar\psi\psi$ or $i\bar\psi\gamma^5\psi$. Only the former is invariant under parity, but both are allowed. Written in terms of the Weyl fermions, these two mass parameters naturally split into a modulus and complex phase,

$$\mathcal{L}_{\rm mass} = m\left(e^{i\phi}\,\bar\psi_-\psi_+ + e^{-i\phi}\,\bar\psi_+\psi_-\right)$$

However, the anomaly means that we can trade the phase $\phi$ for a theta angle, or vice-versa. Only the linear combination $\theta + \phi$ has physical meaning. More generally, with $N_f$ fermions we can have a complex mass matrix $M$ and the quantity $\theta + \arg(\det M)$ remains invariant under chiral rotations.


The Witten Effect Revisited 


We spent quite a lot of time in earlier sections understanding how the theta angle is 
physical. Now we have to return to these arguments to understand why they fail in the 
presence of massless fermions. For example, in Section 1.2.3 we discussed the Witten 
effect, in which a magnetic monopole picks up an electric charge proportional to $\theta$.
What happens in the presence of a massless fermion? 


The answer to this question is a little more subtle. For fermions of mass m, one finds 
that the fermions form a condensate around the monopole of size ~ 1/m and, in the 
presence of a theta angle, this condensate carries an electric charge that is proportional 
to $\theta$ as expected by the Witten effect. As the mass $m \to 0$, this electric charge spreads
out into an increasingly diffuse cloud until, in the massless limit, it is no longer possible 
to attribute it to the monopole. 


3.3.4 Topological Insulators Revisited 


The ideas above also give us a different perspective on the topological insulator that we met in Section 1.2.1. Consider a Dirac fermion in $d = 3+1$ dimensions, whose mass varies as a function of one direction, say $x^3 = z$. We couple this fermion to a $U(1)$ gauge field, so the action is

$$S = \int d^4x\;\left[i\bar\psi\slashed{D}\psi - m(z)\,\bar\psi\psi\right]$$

We take the profile of the mass to take the form shown in Figure 28. In particular, we have

$$m(z) \to \begin{cases} +m & \text{as } z \to \infty \\ -m & \text{as } z \to -\infty \end{cases}$$

with $m > 0$. If we perform a chiral rotation only in the region $z < 0$, we can make the mass positive again, but only at the expense of introducing a non-trivial $\theta = \pi$. In other words, the massive fermion above provides a microscopic realisation of the topological insulator. Note that the mass term $\bar\psi\psi$ is compatible with time reversal invariance as expected from the topological insulator. (In contrast, a mass term $\bar\psi\gamma^5\psi$ breaks time reversal.)

[Figure 28: the mass profile $m(z)$.]


This set-up also brings something new. Let’s turn off the gauge fields and study the 
Dirac equation. Using the gamma matrices (3.9), the Dirac equation is 


$$i\partial_0\psi_- + i\sigma^i\partial_i\psi_- = m\psi_+$$
$$i\partial_0\psi_+ - i\sigma^i\partial_i\psi_+ = m\psi_- \qquad (3.55)$$

Solutions to these equations include excitations propagating in the asymptotic |z| → ∞
region, but these all cost energy E > m. However, there can be solutions with energy
E < m that are bound to the region z ≈ 0. In general, the number of such bound
states will depend on the properties of m(z). But there is one special solution that
always exists, provided the mass profile has the asymptotic behaviour described above.
This is given by the ansatz


$$\psi_+ = i\sigma^3\psi_- = \exp\left(-\int^z dz'\, m(z')\right)\chi(x^0, x^1, x^2)$$

Note that this ansatz is localised around z ≈ 0, dropping off exponentially as $e^{-m|z|}$
as z → ±∞. It has the property that the $\partial_z$ variation in (3.55) cancels the m(z)
dependence, leaving us with the 2-component spinor χ(x) which must satisfy

$$\partial_0\chi + \sigma^1\partial_1\chi + \sigma^2\partial_2\chi = 0$$

But this is the Dirac equation for a massless spinor in d = 2+1 dimensions. This is a
Fermi zero mode, similar in spirit to those that we saw above associated to instantons.
In the present context, such zero modes were first discovered by Jackiw and Rebbi.
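
The localisation of the zero mode can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical kink profile m(z) = m0 tanh(z) (this choice, and all function names, are illustrative and not taken from the text). For m0 = 1 the factor exp(−∫ m) should reduce to 1/cosh(z), whose square integrates to 2.

```python
import math

def mass_profile(z, m0=1.0):
    """Hypothetical kink: m(z) -> +m0 as z -> +inf and -m0 as z -> -inf."""
    return m0 * math.tanh(z)

def zero_mode_profile(z, m0=1.0, steps=2000):
    """exp(-int_0^z m(z') dz'), with the integral done by the trapezoidal rule."""
    h = z / steps
    integral = 0.0
    for k in range(steps):
        a, b = k * h, (k + 1) * h
        integral += 0.5 * h * (mass_profile(a, m0) + mass_profile(b, m0))
    return math.exp(-integral)

def norm_squared(m0=1.0, cutoff=20.0, steps=2000):
    """Approximate integral of profile(z)^2 over -cutoff < z < cutoff."""
    h = 2 * cutoff / steps
    total = 0.0
    for k in range(steps + 1):
        z = -cutoff + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * h * zero_mode_profile(z, m0, steps=200) ** 2
    return total
```

Flipping the sign in the exponent would instead give a profile that grows like cosh(z), which is the non-normalisable partner.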

We learn that, in this realisation, the boundary of the topological insulator houses a
single gapless fermion. Indeed, these surface states can be observed in ARPES experiments
and have become the poster boy for topological insulators. An example is shown in
Figure 29, beautifully revealing the relativistic E = |k| dispersion relation.

Figure 29: [ARPES spectrum with binding energy (eV) on the vertical axis; labelled
features: Fermi level, bulk conduction band (BCB) and BCB bottom, band gap, surface
state band (SSB), Dirac point, bulk valence band (BVB)]

Note that the surface of the topological insulator only houses a single, 3d Dirac
fermion. The other putative zero mode would come from $\psi_+ = -i\sigma^3\psi_-$ but
this solves the equations of motion only if $\psi_+ \sim \exp\left(+\int dz\, m(z)\right)$,
and this is not normalisable.


There is an important technicality in the above story. As we have stressed, the 
topological insulator preserves time-reversal invariance. Yet it turns out that a single 
Dirac fermion in d = 2+1 dimensions does not preserve time-reversal. (We will discuss 
this in some detail in Section 8.5.) However, as the topological insulator shows, it is 
possible for time-reversal invariance to be preserved providing that the 3d fermion is 
housed as part of a larger 4d world. This is an example of a more general mechanism 
called anomaly inflow that will be described in Section 4.4.1. 


3.4 Gauge Anomalies 


The chiral anomaly of section 3.1 is an anomaly in a global symmetry: the naive 
conservation law of axial charge is violated in the quantum theory in the presence of 
gauge fields coupled to the vector current. Such anomalies in global symmetries are 
interesting: as we’ve seen, they are closely related to ideas of topology in gauge theory, 
and give rise to novel physical effects. (We will see the effect of the anomaly on pion 
decay in Section 5.4.3.) 


In this section, we will focus on anomalies in gauge symmetries. While anomalies 
in global symmetries are physically interesting, anomalies in gauge symmetries kill 
all physics completely: they render the theory mathematically inconsistent! This is 
because “gauge symmetries” are not really symmetries at all, but redundancies in our 
description of the theory. Moreover, as we sketched in Section 2.1.2, these redundancies 
are necessary to make sense of the theory. An anomaly in gauge symmetry removes 
this redundancy. If we wish to build a consistent theory, then we must ensure that all 
gauge anomalies vanish. 

There is a straightforward way to ensure that gauge symmetries are non-anomalous:
only work with Dirac fermions, and with gauge fields which are coupled in the same 
manner to both left- and right-handed fermions. Such theories are called vector-like. 
Nothing bad happens. 


Here we will be interested in a more subtle class of theories, in which left- and right- 
handed fermions are coupled differently to gauge fields. These are called chiral gauge 
theories and we have to work harder to ensure that they are consistent. Note that 
chiral gauge theories are necessarily coupled to only massless fermions. This is because 
a mass term requires both left- and right-handed Weyl fermions and is gauge invariant 
only if they transform in the same way under the gauge group. In other words, mass 
terms are only possible for vector-like matter.


We describe how to build chiral gauge theories with U(1) gauge groups in section 
3.4.1, with non-Abelian gauge groups in section 3.4.2 and with SU(2) gauge groups 
(which turns out to be special) in section 3.4.3. 


3.4.1 Abelian Chiral Gauge Theories 


Here is an example of a bad theory: take a Dirac fermion and try to gauge both 
axial and vector symmetries. We know from our discussion in Section 3.1 that some 
combination of these will necessarily be anomalous. 


Equivalently, we could consider a single U(1) gauge theory coupled to just a single 
Weyl fermion, either left- or right-handed. This too will be anomalous, and therefore 
a sick theory. 


So how can we construct a chiral gauge theory with a single U(1) gauge field? We
will have $N_L$ left-handed Weyl fermions with charges $Q^L_a \in \mathbb{Z}$ and $N_R$
right-handed Weyl fermions with charges $Q^R_j \in \mathbb{Z}$. To ensure that the
triangle diagram vanishes, we require

$$\sum_{a=1}^{N_L} (Q^L_a)^3 = \sum_{j=1}^{N_R} (Q^R_j)^3 \qquad (3.56)$$

There are obvious solutions to this equation with $N_L = N_R$ and $Q^L_a = Q^R_a$. These are the
vector-like theories. Here we are interested in the less-obvious solutions, corresponding 
to chiral theories. We will assume that we have removed all vector-like matter, so that 
the left-handed and right-handed fermions have no charges in common. 

We can simplify (3.56) a little. In d = 3+1 dimensions, the anti-particle of a
right-handed fermion is left-handed. This means that we can always work with a set of
purely left-handed fermions which have charges $Q_a = \{Q^L_a, -Q^R_j\}$. The requirement
of anomaly cancellation is then

$$\sum_{a=1}^{N} Q_a^3 = 0 \qquad (3.57)$$

Figure 30: [triangle diagram with a charge Q at the vertices]

We would like to understand the possible solutions to this equation. In particular, what 
is the simplest set of charges that satisfies this? 


Clearly for N = 2 fermions, the charges must come in a ± pair which is a vector-like
theory. So let’s look at N = 3. We must have two positive charges and one negative (or
the other way round). Set $Q_a = (x, y, -z)$ with x, y, z positive integers. The condition
for anomaly cancellation then becomes

$$x^3 + y^3 = z^3$$

Rather famously, this equation has no solutions: this is the result of Fermat’s last
theorem.


What about chiral gauge theories with N = 4 Weyl fermions? Now we have two
options: we could take three positive charges and one negative and look for positive
integers satisfying

$$x^3 + y^3 + z^3 = w^3 \qquad (3.58)$$

The simplest integers satisfying this are 3, 4, 5 and 6. Mathematicians have constructed
a number of different parametric solutions to this equation, although not one that gives
the most general solution. The simplest is due to Ramanujan,

$$x = 3n^2 + 5nm - 5m^2 \ , \quad y = 4n^2 - 4nm + 6m^2$$
$$z = 5n^2 - 5nm - 3m^2 \ , \quad w = 6n^2 - 4nm + 4m^2 \qquad (3.59)$$

with n and m positive integers.


We can also construct chiral gauge theories with N = 4 Weyl fermions by having
two of positive charge and two of negative charge, so that

$$x^3 + y^3 = z^3 + w^3 \qquad (3.60)$$

This equation is also closely associated to Ramanujan and the famous story of G. H.
Hardy’s visit to his hospital bed. Struggling for small talk, Hardy commented that
the number of his taxicab was particularly uninteresting: 1729. Ramanujan responded
that, far from being uninteresting, this corresponds to the simplest four dimensional
chiral gauge theory, since it is the first number that can be expressed as the sum of two
cubes in two different ways: $1^3 + 12^3 = 9^3 + 10^3$. The most general solution to (3.60)
is known. Some of these can be generated by putting m = n + 1 into the Ramanujan
formula (3.59) which, for n ≥ 3, gives x > 0 but z < 0, and so yields solutions to (3.60)
rather than (3.58).
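
These number-theoretic statements are easy to check by brute force. The sketch below is illustrative only (the function names are hypothetical): it evaluates Ramanujan's family (3.59) and searches for the smallest integer that is a sum of two positive cubes in two ways.

```python
def ramanujan(n, m):
    """Ramanujan's parametric family (3.59); returns (x, y, z, w)."""
    x = 3 * n**2 + 5 * n * m - 5 * m**2
    y = 4 * n**2 - 4 * n * m + 6 * m**2
    z = 5 * n**2 - 5 * n * m - 3 * m**2
    w = 6 * n**2 - 4 * n * m + 4 * m**2
    return x, y, z, w

def satisfies_cubic(x, y, z, w):
    """True if x^3 + y^3 + z^3 = w^3, with signs of the entries allowed."""
    return x**3 + y**3 + z**3 == w**3

def smallest_taxicab(limit=20):
    """Smallest sum of two positive cubes a^3 + b^3 (a <= b <= limit) found in two ways."""
    seen = {}
    hits = []
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            s = a**3 + b**3
            if s in seen:
                hits.append((s, seen[s], (a, b)))
            else:
                seen[s] = (a, b)
    return min(hits)
```

For instance, `ramanujan(1, 0)` gives (3, 4, 5, 6), while `ramanujan(3, 4)` gives (7, 84, -63, 70): moving the negative entry to the other side turns the latter into a solution of the two-plus-two equation (3.60).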


Avoiding the Mixed Gravitational Anomaly 


So far, we have been concerned only with cancelling the gauge anomaly. However, if we
wish to place our theory on curved spacetime, then we must require that the mixed
gauge-gravitational anomaly (3.49) also vanishes. For this, the diagram shown in
Figure 31 must also vanish when summed over all fermions, requiring

$$\sum_{a=1}^{N} Q_a = 0 \qquad (3.61)$$

Figure 31: [triangle diagram with one gauge field and two gravitons]


Note that the diagram with two gauge fields and a single graviton vanishes because 
diffeomorphism symmetry is a non-Abelian group, and the trace of a single generator 
vanishes. 


Our goal now is to find a set of charges which solve both (3.57) and (3.61)*. Let’s
first see that these cannot be satisfied by a set of N = 4 integers. To show that there
can be no solutions with three positive integers and one negative, we could either plug
in the explicit solution (3.59) or, alternatively, use (3.61) to write w = x + y + z which
then implies that $w^3 > x^3 + y^3 + z^3$, in contradiction to (3.58). To see that no taxicab
numbers can solve (3.61), write one pair as x, y = a ± b and the other pair as z, w = c ± d
with $a, b, c, d \in \frac{1}{2}\mathbb{Z}$. Then (3.61) tells us that a = c, while (3.57) requires b = d.
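
The last step can be made explicit. As a short worked expansion, writing x, y = a ± b and z, w = c ± d as in the argument above, the sums and cubic sums of each pair are

```latex
\begin{align*}
x + y &= 2a, & x^3 + y^3 &= (a+b)^3 + (a-b)^3 = 2a^3 + 6ab^2, \\
z + w &= 2c, & z^3 + w^3 &= (c+d)^3 + (c-d)^3 = 2c^3 + 6cd^2 .
\end{align*}
```

Equality of the linear sums gives a = c. With a ≠ 0, equality of the cubic sums then forces b² = d², so the two pairs coincide and the charges come in ± pairs: the theory is vector-like.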


It turns out that some questions we can ask about the solutions to (3.57) and (3.61) 
are hard. For example if you fix N it may be difficult to determine if there is a solution 
with a specified subset of charges. In contrast, it is straightforward to classify solutions 
if we place a bound, $|Q_a| \le q$, on the charges. Consider the set of charges

$$\{Q_a\} = \{1^{[d_1]}, 2^{[d_2]}, \ldots, q^{[d_q]}\}$$

* I’m grateful to Imre Leader for explaining how to solve these equations.
where we use notation that $d_p$ is the multiplicity of the charge p if $d_p > 0$, while $|d_p|$
is the multiplicity of −p if $d_p < 0$. This notation has the advantage of removing any
non-chiral matter since we can’t have both charges p and −p. The two conditions (3.57)
and (3.61) become

$$\sum_{p=1}^{q} p^3 d_p = 0 \quad\text{and}\quad \sum_{p=1}^{q} p\, d_p = 0 \qquad (3.62)$$

This can be thought of as specifying two q-dimensional vectors which lie perpendicular
to $d_p$. Solutions to these linear equations for $d_p \in \mathbb{Z}$ span a (q − 2)-dimensional lattice.
Each lattice point corresponds to a solution with the number of fermions given by

$$N = \sum_{p=1}^{q} |d_p|$$


Now we can address the question: what is the simplest chiral gauge theory? Of course,
the answer depends on what you mean by “simple”. For example, you may want the
theory that contains the lowest charge q. In this case, the answer is the set of N = 10
fermions with charges

$$\{Q_a\} = \{1^{[5]}, 2^{[-4]}, 3^{[1]}\}$$

Alternatively, you may instead want to minimize the number of Weyl fermions N in
the theory. The smallest solutions to (3.57) and (3.61) have N = 5 Weyl fermions.
There are many such solutions, but the one with the lowest q is

$$\{Q_a\} = \{1, 5, -7, -8, 9\}$$


In general, the trick of changing the non-linear diophantine equations (3.57) and (3.61) 
into the much simpler linear equations (3.62) means that it is simple to generate con- 
sistent chiral Abelian gauge theories. 
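
To illustrate how simple the search becomes, here is a minimal brute-force sketch. It is not the method of the text, just a direct enumeration; the function names and the bounds `dmax` and `q` are assumptions chosen for illustration.

```python
from itertools import combinations_with_replacement, product

def is_anomaly_free(charges):
    """True if the cubic sum (3.57) and the linear sum (3.61) both vanish."""
    return sum(c**3 for c in charges) == 0 and sum(charges) == 0

def is_chiral(charges):
    """True if there is no zero charge and no vector-like pair (p, -p)."""
    present = set(charges)
    return 0 not in present and all(-c not in present for c in present)

def multiplicity_solutions(q, dmax):
    """Non-zero integer vectors (d_1, ..., d_q) with |d_p| <= dmax solving (3.62)."""
    solutions = []
    for d in product(range(-dmax, dmax + 1), repeat=q):
        linear = sum(p * dp for p, dp in enumerate(d, start=1))
        cubic = sum(p**3 * dp for p, dp in enumerate(d, start=1))
        if any(d) and linear == 0 and cubic == 0:
            solutions.append(d)
    return solutions

def chiral_sets(n, q):
    """Chiral, anomaly-free multisets of n charges with |Q_a| <= q (brute force)."""
    values = [c for c in range(-q, q + 1) if c != 0]
    return [c for c in combinations_with_replacement(values, n)
            if is_chiral(c) and is_anomaly_free(c)]
```

Calling `multiplicity_solutions(3, 5)` returns the multiplicities (5, -4, 1) and their overall sign flip, i.e. five fermions of charge 1, four of charge −2 and one of charge 3. Calling `chiral_sets(5, 9)` includes the five-fermion set (-8, -7, 1, 5, 9) quoted above, while `chiral_sets(4, 12)` comes back empty, in line with the argument that N = 4 admits no chiral solutions.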


Finally, to paraphrase Coleman, if you want your Hilbert space to contain structures
capable of knowing joy, then the set of N = 15 fermions with charges
$\{1^{[6]}, 2^{[3]}, 3^{[-2]}, 4^{[-3]}, 6^{[1]}\}$
is a good place to start; we’ll see the importance of these charges in Section 3.4.4.


3.4.2 Non-Abelian Gauge Anomalies 


We now turn to non-Abelian gauge theories with gauge group G. We have to worry 
about the familiar triangle diagrams, now with non-Abelian currents on each of the 
external legs: 


[Figure: triangle diagram with a non-Abelian current on each of the three external legs]


The anomaly must be symmetric under ν ↔ λ, and this symmetry then imposes itself
on the group structure. The result is that a Weyl fermion in a representation R, with
generators $T^a$, contributes a term to the anomaly proportional to the totally symmetric
group factor

$$d^{abc}(R) = \mathrm{tr}\, T^a\{T^b, T^c\}$$

Furthermore, left and right-handed fermions contribute to the anomaly with opposite
signs.

We will consider a bunch of left-handed Weyl fermions, transforming in representations
$R_{L\,i}$, with $i = 1, \ldots, N_L$ and a bunch of right-handed Weyl fermions transforming
in $R_{R\,j}$ with $j = 1, \ldots, N_R$. The requirement for anomaly cancellation is then

$$\sum_{i=1}^{N_L} d^{abc}(R_{L\,i}) = \sum_{j=1}^{N_R} d^{abc}(R_{R\,j}) \qquad (3.63)$$

As long as the gauge group is semi-simple (i.e. contains no U(1) factors) then there is
no analog of the mixed gauge-gravitational anomaly (3.61) because $\mathrm{tr}\, T^a = 0$.


How can we satisfy (3.63)? One obvious way is to have an equal number of left- and 
right-handed fermions transforming in the same representations of the gauge group. A 
prominent example is QCD, which consists of G = SU(3), coupled to $N_f = 6$ quarks,
each of which is a Dirac fermion. For such vector-like theories, there is no difficulty 
in assigning mass terms to fermions which fits in with our theme that anomalies are 
associated only to massless fermions. 


There are other, straightforward ways to solve (3.63). The anomaly vanishes for any
representation that is either real (e.g. the adjoint) or pseudoreal (e.g. the fundamental
of SU(2)). Here “pseudoreal” means that the conjugate representation $\bar T^a$ is related to
the original $T^a$ by a unitary matrix U, acting as

$$\bar T^a = U T^a U^{-1}$$

If we denote a group element by $e^{i\alpha^a T^a}$ then, in the conjugate representation, the same
group element is given by $e^{-i\alpha^a T^{a\star}}$. This means that the conjugate representation can
be written as $\bar T^a = -T^{a\star} = -(T^a)^T$, where the last equality follows because we can
always take $T^a$ to be Hermitian. The upshot of these arguments is that, for a real or
pseudoreal representation,

$$\mathrm{tr}\, T^a\{T^b, T^c\} = \mathrm{tr}\, \bar T^a\{\bar T^b, \bar T^c\} = -\mathrm{tr}\, (T^a)^T\{(T^b)^T, (T^c)^T\} = -\mathrm{tr}\, T^a\{T^b, T^c\}$$

where the final equality comes from the fact that $\mathrm{tr}\, A = \mathrm{tr}\, A^T$. We learn that for any
real or pseudoreal representation $\mathrm{tr}\, T^a\{T^b, T^c\} = 0$. Once again, this tallies nicely with
the fact that anomalies are associated to fermions that are necessarily massless, since
we can always write down a Majorana mass term for fermions in real representations.


The only gauge groups that suffer from potential anomalies are those with complex
representations. This already limits the possibilities: we need only worry about gauge
anomalies in simple groups when

$$G = \begin{cases} SU(N) & \text{with } N \ge 3 \\ SO(4N+2) \\ E_6 \end{cases}$$


We should add to this list G = U(1) which we discussed separately in the previous 
section. 


The list of gauge groups which might suffer perturbative gauge anomalies is short.
But it turns out that it is shorter still, since the anomaly coefficient $\mathrm{tr}\, T^a\{T^b, T^c\}$
vanishes for both $G = E_6$ and G = SO(4N + 2) with N ≥ 2. (Note that the Lie
algebra so(6) ≅ su(4) so this remains.) We learn that we need only care about these
triangle anomalies when

$$G = SU(N) \quad\text{with } N \ge 3$$


Interestingly, these are the gauge groups which appear most prominently in the study 
of particle physics. 


Let’s now look at solutions to the anomaly cancellation condition (3.63). At first
glance, this looks as if it is a tensor equation and, if each representation R had a
different tensor structure for $d^{abc}$, it would be tricky to solve. Fortunately, that is
not the case. One can show that

$$d^{abc}(R) = A(R)\, d^{abc}(N)$$

where N is the fundamental representation of SU(N). The coefficient A(R) is sometimes
called simply the anomaly of the representation. To see this, first note that we have

$$A(R_1 \oplus R_2) = A(R_1) + A(R_2) \qquad (3.64)$$

But an arbitrary representation can be constructed by taking tensor products of the
fundamental. The representation $R_1 \otimes R_2$ is generated by $1_1 \otimes T^a_2 + T^a_1 \otimes 1_2$, so we
have

$$A(R_1 \otimes R_2) = \dim(R_1)\, A(R_2) + \dim(R_2)\, A(R_1) \qquad (3.65)$$

Finally, note that our calculation above tells us that $A(\bar R) = -A(R)$.


The formulae (3.64) and (3.65) allow us to compute the anomaly coefficient for
different representations providing that we know how to take tensor products. Consider,
for example, representations of G = SU(3). By definition $A(3) = -A(\bar 3) = 1$. If we
use the fact that $3 \otimes 3 = 6 \oplus \bar 3$ then we have

$$A(6) = A(3 \otimes 3) - A(\bar 3) = 3A(3) + 3A(3) - A(\bar 3) = 3 + 3 - (-1) = 7$$

Similarly, $3 \otimes \bar 3 = 8 \oplus 1$, which gives

$$A(8) = 3A(3) + 3A(\bar 3) - A(1) = 3 + (-3) - 0 = 0$$

as expected since the adjoint 8 is a real representation.
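
These coefficients can be cross-checked independently. The sketch below is not the tensor-product method of the text. It assumes instead that, since the symmetric invariant of every representation is proportional to that of the fundamental, the ratio of cubic traces of any single generator T with non-vanishing cubic trace in the fundamental gives A(R). The choice T = diag(1, ..., 1, −(N−1)) and the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement

def fundamental(n):
    """Eigenvalues of T = diag(1, ..., 1, -(n-1)) in the fundamental of SU(n), n >= 3."""
    return [1] * (n - 1) + [-(n - 1)]

def antifundamental(n):
    return [-e for e in fundamental(n)]

def symmetric(n):
    """Eigenvalues of T on the symmetric two-index tensor."""
    return [a + b for a, b in combinations_with_replacement(fundamental(n), 2)]

def antisymmetric(n):
    """Eigenvalues of T on the anti-symmetric two-index tensor."""
    return [a + b for a, b in combinations(fundamental(n), 2)]

def adjoint(n):
    """Eigenvalues of T on the adjoint: N x N-bar with one singlet removed."""
    f = fundamental(n)
    weights = [a - b for a in f for b in f]
    weights.remove(0)
    return weights

def anomaly(eigenvalues, n):
    """A(R) = tr_R T^3 / tr_N T^3 for the chosen diagonal generator T."""
    cubic = lambda values: sum(v**3 for v in values)
    return Fraction(cubic(eigenvalues), cubic(fundamental(n)))
```

With these definitions `anomaly(symmetric(3), 3)` evaluates to 7 and `anomaly(adjoint(3), 3)` to 0, matching A(6) and A(8) above.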


What is the Simplest Non-Abelian Chiral Gauge Theory? 


A chiral gauge theory is one in which the left-handed and right-handed Weyl fermions 
transform in different representations of the gauge group. This prohibits a tree-level 
mass term for the fermions, since it is not possible to write down a fermion bilinear. 
Theories of this type comprise some of the most interesting quantum field theories, 
both for theoretical and phenomenological reasons. (We’ll see a particularly interesting 
chiral gauge theory in Section 3.4.4.) Notably, there are obstacles to placing these 
theories on the lattice, which means that we have no numerical safety net when trying 
to understand their strong coupling dynamics. 


We can use our results above to construct some simple non-Abelian chiral gauge
theories. One can show that the anomaly coefficients for the symmetric (Sym) and
anti-symmetric (Asym) two-index tensor representations are:

$$A(\mathrm{Sym}) = N + 4 \quad\text{and}\quad A(\mathrm{Asym}) = N - 4$$

From this, we learn that we can construct a number of chiral gauge theories by taking,
for N ≥ 5,

G = SU(N) with an Asym and N − 4 $\overline{\square}$ Weyl fermions

where $\overline{\square}$ is shorthand for the anti-fundamental. Alternatively, we could have, for N ≥ 3,

G = SU(N) with a Sym and N + 4 $\overline{\square}$

or

G = SU(N) with a Sym, an Asym and 2N $\overline{\square}$

The simplest of these theories is:

SU(5) with a $\bar 5$ and 10 (3.66)


This is a prominent candidate for a grand unified theory, incorporating the Standard 
Model gauge group and one generation of matter fields. We’ll return to these chiral 
gauge theories in Section 5.6.4 where we describe their likely dynamics. 
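
As a quick sanity check of the three families above, the following sketch adds up the anomaly coefficients of the left-handed Weyl fermions in each case. It takes A = 1 for the fundamental, −1 for the anti-fundamental, N + 4 for the symmetric and N − 4 for the anti-symmetric representation from the text; the dictionary keys and function names are hypothetical labels.

```python
def anomaly_coefficients(n):
    """A(R) for the SU(n) representations used in the text (n >= 3)."""
    return {"fund": 1, "antifund": -1, "sym": n + 4, "asym": n - 4}

def total_anomaly(n, content):
    """Sum of multiplicity * A(R) over left-handed Weyl fermions."""
    coefficients = anomaly_coefficients(n)
    return sum(mult * coefficients[rep] for rep, mult in content.items())

def families(n):
    """The three families of chiral theories listed above, as {rep: multiplicity}."""
    return {
        "asym + (n-4) antifund": {"asym": 1, "antifund": n - 4},
        "sym + (n+4) antifund": {"sym": 1, "antifund": n + 4},
        "sym + asym + 2n antifund": {"sym": 1, "asym": 1, "antifund": 2 * n},
    }
```

For SU(5) the first family reduces to one anti-symmetric 10 and one anti-fundamental, and `total_anomaly(5, {"asym": 1, "antifund": 1})` vanishes, whereas the 10 on its own has anomaly 1.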


Alternatively, we can build a chiral gauge theory by taking either $E_6$ or SO(4N + 2)
with complex representations, where the anomaly coefficients all vanish. The simplest 
such example is SO(10) with a single Weyl fermion in the 16 spinor representation. 
This too is a prominent candidate for a grand unified theory. 


The chiral gauge theories described above are the simplest to write down. But it 
turns out that there is one chiral gauge theory which has fewer fields. This will be 
described in section 3.4.4. But first there is one further consistency condition that we 
need to take into account. 


3.4.3 The SU(2) Anomaly 


The list of gauge groups that suffer a perturbative anomaly does not include G = SU(2).
This is because all representations are either real or pseudoreal. For example, the
fundamental 2 representation, with the generators given by the Pauli matrices $\sigma^a$, is
pseudoreal. In agreement with our general result above, it is simple to check that

$$d^{abc} = \mathrm{tr}\, \sigma^a\{\sigma^b, \sigma^c\} = 0$$


This would naively suggest that we don’t have to worry about anomalies in such theo- 
ries. But this is premature. There is one further, rather subtle anomaly that we need 
to take into account. This was first discovered by Witten and, unlike our previous 
anomalies, cannot be seen in perturbation theory. It is a non-perturbative anomaly. 
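
The perturbative statement is simple to verify explicitly. The sketch below (plain Python, with hypothetical helper names) checks that the symmetric trace of three Pauli matrices vanishes for every choice of indices, and that the fundamental of SU(2) is pseudoreal, using the standard choice U = iσ² as the intertwining matrix.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def d_symbol(a, b, c):
    """tr sigma^a {sigma^b, sigma^c} for indices a, b, c in {0, 1, 2}."""
    bc = matmul(SIGMA[b], SIGMA[c])
    cb = matmul(SIGMA[c], SIGMA[b])
    anticommutator = [[bc[i][j] + cb[i][j] for j in range(2)] for i in range(2)]
    return trace(matmul(SIGMA[a], anticommutator))

def is_pseudoreal():
    """Check -(sigma^a)^* = U sigma^a U^{-1} with U = i sigma^2, U^{-1} = -i sigma^2."""
    u = [[1j * entry for entry in row] for row in SIGMA[1]]
    u_inv = [[-1j * entry for entry in row] for row in SIGMA[1]]
    for s in SIGMA:
        conjugate_rep = [[-complex(entry).conjugate() for entry in row] for row in s]
        if matmul(matmul(u, s), u_inv) != conjugate_rep:
            return False
    return True
```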


Here is the punchline. An SU(2) gauge theory with a single Weyl fermion in the
fundamental representation is mathematically inconsistent. Furthermore, an SU(2)
gauge theory with any odd number of Weyl fermions is inconsistent. To make sense
of the theory, Weyl fermions must come in pairs. In other words, they must be Dirac
fermions.

To see why, let’s start with a theory which makes sense. We will take a Dirac fermion
Ψ in the fundamental representation of SU(2). The partition function in Euclidean
space is, schematically,

$$Z = \int \mathcal{D}\Psi\,\mathcal{D}\bar\Psi\,\mathcal{D}A \; \exp\left(-\int d^4x \; \frac{1}{2g^2}\,\mathrm{tr}\, F^{\mu\nu}F_{\mu\nu} + i\bar\Psi\slashed{D}\Psi\right)$$
$$\phantom{Z} = \int \mathcal{D}A \; \det(i\slashed{D})\, \exp\left(-\int d^4x \; \frac{1}{2g^2}\,\mathrm{tr}\, F^{\mu\nu}F_{\mu\nu}\right)$$


This determinant is an infinite product over eigenvalues of $i\slashed{D}$ and, as such, we have
to regulate this product in a gauge invariant way. We met one such regularisation in
Section 3.2.2 where we discussed the measure in the path integral. Another simple
possibility for a Dirac fermion is Pauli-Villars regularisation.


Let’s now repeat this for a Weyl fermion. For concreteness, let’s take a left-handed
fermion ψ. Following (3.10), we have the path integral,

$$Z = \int \mathcal{D}\psi\,\mathcal{D}\bar\psi\,\mathcal{D}A \; \exp\left(-\int d^4x \; \frac{1}{2g^2}\,\mathrm{tr}\, F^{\mu\nu}F_{\mu\nu} + i\bar\psi\bar\sigma^\mu D_\mu\psi\right)$$

Integrating out the fermions, it looks like we’re left with the object $\det(i\bar\sigma^\mu D_\mu)$. But
this is rather subtle, because $i\bar\sigma^\mu D_\mu$ doesn’t map a vector space back to itself; instead
it maps left-handed fermions onto right-handed fermions. To proceed, it’s best to think
of the Weyl fermion as a projection $\psi = \frac{1}{2}(1+\gamma^5)\Psi$. We then have

$$Z = \int \mathcal{D}A \; \det\left(i\slashed{D}\,\frac{1+\gamma^5}{2}\right) \exp\left(-\int d^4x \; \frac{1}{2g^2}\,\mathrm{tr}\, F^{\mu\nu}F_{\mu\nu}\right) \qquad (3.67)$$

As we discussed in Section 3.3.1, $i\slashed{D}$ is a Hermitian operator and therefore has real
eigenvalues. The existence of the $\gamma^5$ matrix ensures that these eigenvalues come in ±
pairs,

$$i\slashed{D}\phi_n = \lambda_n\phi_n \quad\Rightarrow\quad i\slashed{D}(\gamma^5\phi_n) = -\lambda_n\gamma^5\phi_n$$

Let us assume that we have a gauge potential with no zero eigenvalues. Then the
spectrum of eigenvalues of $i\slashed{D}$ looks something like that shown on the left-hand axis of
Figure 32. Formally, $\det(i\slashed{D}) = \prod_n \lambda_n$. To define the determinant
$\det(i\slashed{D}(1+\gamma^5)/2)$, we should just take the product over half of these eigenvalues.
In other words,

$$\det\left(i\slashed{D}\,\frac{1+\gamma^5}{2}\right) = \det{}^{1/2}(i\slashed{D})$$

This formula is intuitive because a Dirac fermion consists of two Weyl fermions. Our
job is to make sense of it. The difficulty is that there is a ± ambiguity when we take
the square-root $\det^{1/2}(i\slashed{D})$. This, as we will see, will be our downfall.

Let’s try to define a consistent sign for our determinant $\det^{1/2}(i\slashed{D})$. To do so, we
need to pick half of these eigenvalues in a consistent way. Here is how we will go about
it. We start with some specific gauge configuration $A^\star_\mu$. For this particular choice, we
define $\det^{1/2}(i\slashed{D})$ to be the product of the positive eigenvalues only, throwing away
the negative eigenvalues. As we vary the $A_\mu$ away from $A^\star_\mu$, we follow this set of
preferred eigenvalues and continue to take their product. It may be that as we vary
$A_\mu$, some of these chosen eigenvalues cross zero and become negative. Whenever this
happens, $\det^{1/2}(i\slashed{D})$ changes sign. If we’re lucky, this method has succeeded in
assigning a particular sign to $\det^{1/2}(i\slashed{D})$ for each configuration $A_\mu$.

Figure 32: [spectrum of $i\slashed{D}$, with eigenvalues crossing zero as the gauge field varies from $A_\mu$ to $A^\Omega_\mu$]


Now we come to the important question: is our choice of sign gauge invariant? In
particular, suppose that we start with a gauge connection $A_\mu$ and smoothly vary it until
we come back to a new gauge connection which is gauge equivalent to the original,

$$A_\mu \to A^\Omega_\mu = \Omega(x) A_\mu \Omega^{-1}(x) + i\Omega(x)\partial_\mu\Omega^{-1}(x)$$

For our theory to be consistent, we need that the sign of $\det^{1/2}(i\slashed{D})$ is the same for
these two gauge equivalent configurations. If this fails to be true, then the integral over
$A_\mu$ in the partition function (3.67) will give us Z = 0 and our theory is empty.


How could this fail to work? We know that the total spectrum of $i\slashed{D}$ is the same for
gauge equivalent configurations. The concern is that as we vary smoothly from $A_\mu$ to
$A^\Omega_\mu$ an odd number of eigenvalues may cross the origin, as shown in the figure. This
would result in a change to the sign of the determinant.
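
The mechanism can be mimicked in a toy model. The sketch below is purely illustrative and is not a real Dirac operator: it assumes a hypothetical one-parameter family of spectra, labelled by t between 0 and 1, in which eigenvalues always come in ± pairs and a single tracked eigenvalue crosses zero at t = 1/2. The full spectrum at t = 0 and t = 1 is identical, yet the product of the tracked eigenvalues changes sign.

```python
def tracked_eigenvalues(t, levels=4, gap=0.5):
    """Eigenvalues selected as positive at t = 0; the lowest one crosses zero at t = 1/2."""
    lowest = gap * (1 - 2 * t)
    return [lowest] + [k + gap for k in range(1, levels)]

def full_spectrum(t, levels=4, gap=0.5):
    """Full toy spectrum: every tracked eigenvalue together with its partner of opposite sign."""
    tracked = tracked_eigenvalues(t, levels, gap)
    return sorted(tracked + [-lam for lam in tracked])

def sqrt_det(t, levels=4, gap=0.5):
    """Product of the tracked eigenvalues only, the toy analogue of det^{1/2}."""
    result = 1.0
    for lam in tracked_eigenvalues(t, levels, gap):
        result *= lam
    return result
```

Here `full_spectrum(0) == full_spectrum(1)`, so the two endpoints look like gauge equivalent configurations, while `sqrt_det(1) == -sqrt_det(0)`.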

To proceed, we need to classify the kinds of gauge transformations Ω(x) that we
can have. We will consider gauge transformations such that Ω(x) → 1 as x → ∞.
This effectively compactifies $\mathbb{R}^4$ to $S^4$ and all such gauge transformations provide a map
$\Omega: S^4 \to SU(2)$. These maps are characterised by the homotopy group

$$\Pi_4(SU(2)) = \mathbb{Z}_2 \qquad (3.68)$$

Note that in our discussion of instantons in Section 2.3 we used $\Pi_3(SU(2)) = \mathbb{Z}$.
That’s fairly intuitive to understand because $SU(2) \cong S^3$, so the third homotopy group
counts winding from a 3-sphere to a 3-sphere. The fourth homotopy group above is
less intuitive*: it tells us that there are topologically non-trivial maps from $S^4$ to $S^3$.

* Higher homotopy groups only get more counter-intuitive! See, for example, the Wikipedia article
on the homotopy groups of spheres.

The homotopy group (3.68) means that all SU(2) gauge transformations fall into two
classes: trivial or non-trivial. We will see that under a non-trivial gauge transformation

$$\det{}^{1/2}(i\slashed{D}) \to -\det{}^{1/2}(i\slashed{D}) \qquad (3.69)$$


This is the non-perturbative SU(2) anomaly that renders the theory inconsistent. 
(Rather annoyingly, because the anomaly is related to the global structure of the gauge 
group, it is sometimes referred to as a “global anomaly”, even though it is an anomaly 
in a gauge symmetry instead of a global symmetry.) 


Follow the Eigenvalue 


It remains to show that $\det^{1/2}(i\slashed{D})$ indeed flips sign under a non-trivial gauge
transformation as in (3.69). To do so, we consider a gauge connection A on the 5d space
$M_5 = \mathbb{R} \times S^4$. We parameterise the $\mathbb{R}$ factor by τ and work in a gauge with $A_\tau = 0$.
Meanwhile, for μ = 1, 2, 3, 4 labelling a direction on $S^4$ we choose a gauge configuration
such that

$$A_\mu(x, \tau) \to A_\mu(x) \quad\text{as}\quad \tau \to -\infty \qquad (3.70)$$

and

$$A_\mu(x, \tau) \to A^\Omega_\mu(x) \quad\text{as}\quad \tau \to +\infty \qquad (3.71)$$

Our 5d gauge field A(x, τ) smoothly interpolates between a 4d gauge configuration at
τ → −∞ and a gauge equivalent configuration at τ → +∞, related by a non-trivial
gauge transformation.


We now consider the five-dimensional Dirac operator

$$\slashed{D}_5\Psi = \gamma^\tau\frac{\partial\Psi}{\partial\tau} + \slashed{D}\Psi$$

The operator $\slashed{D}_5$ is both real and anti-symmetric. (Both the spinor representation
of SO(5) and the fundamental representation of the gauge group SU(2) are pseudoreal,
but their tensor product is real.) There are two possibilities for the eigenvalues
of such an operator: either they are zero, or they are purely imaginary and come
in conjugate pairs. This means that as we vary the gauge connection $A_\mu$ and the
eigenvalues smoothly change, the number of zero eigenvalues can only change in pairs.
The number of zero eigenvalues, mod 2, is therefore a topological invariant.

This $\mathbb{Z}_2$ topological invariant can be computed by a variant of the Atiyah-Singer index
theorem. For any gauge configuration with boundary conditions (3.70) and (3.71), the
index theorem tells us that the number of zero modes is necessarily odd.

Let’s now see why this $\mathbb{Z}_2$ index of the five-dimensional Dirac operator $\slashed{D}_5$ tells us
that the determinant necessarily flips sign as in (3.69). Any zero mode of $\slashed{D}_5$ obeys

$$\frac{\partial\Psi}{\partial\tau} = -\gamma^\tau\slashed{D}\Psi \qquad (3.72)$$

We will assume that the gauge configuration $A_\mu(x, \tau)$ varies slowly enough in τ that
we can use the adiabatic approximation for the eigenfunctions. This means that the
eigenfunction Ψ(x, τ) can be written as

$$\Psi(x, \tau) = f(\tau)\,\phi(x; \tau)$$

where, for each fixed τ, φ(x; τ) is an eigenfunction of the 4d Dirac operator

$$\gamma^\tau\slashed{D}\,\phi(x; \tau) = \lambda(\tau)\,\phi(x; \tau)$$

In this adiabatic approximation, the zero mode equation (3.72) becomes

$$\frac{df}{d\tau} = -\lambda(\tau) f(\tau) \quad\Rightarrow\quad f(\tau) = f_0 \exp\left(-\int^\tau d\tau'\, \lambda(\tau')\right)$$

But f(τ) must be normalisable. This requires that λ(τ) > 0 as τ → +∞, but λ(τ) < 0
as τ → −∞.


We learn that for every normalisable zero mode of $\slashed{D}_5$, there must be an eigenvalue
of the four-dimensional Dirac operator $i\slashed{D}$ which crosses from positive to negative as we
vary T. Since the index theorem tells us that there are an odd number of zero modes, 
there must be an odd number of eigenvalues that cross the origin. And this, in turn, 
means that the determinant flips sign under a non-trivial gauge transformation as in 
(3.69). This is why SU(2) gauge theory with a single Weyl fermion — and, indeed, 
with any odd number of Weyl fermions — is inconsistent. 


Other Gauge Groups 


Although advertised here as an anomaly of SU(2) gauge groups, the same argument
holds for any gauge group with non-trivial $\Pi_4$. This is not relevant for other unitary
or orthogonal groups: $\Pi_4(SU(N)) = 0$ for N ≥ 3 and $\Pi_4(SO(N)) = 0$ for all N > 5.
However, SU(2) is also the start of the symplectic series: SU(2) = Sp(1). More
generally,

$$\Pi_4(Sp(N)) = \mathbb{Z}_2 \quad\text{for all } N$$

The same arguments as above tell us that Sp(N) with a single Weyl fermion in the
fundamental representation has a non-perturbative anomaly and is therefore
mathematically inconsistent.


3.4.4 Anomaly Cancellation in the Standard Model


We saw earlier how to build chiral, non-Abelian gauge theories with gauge group 
SU(N). The simplest of these is the SU(5) grand unified candidate (3.66). How- 
ever, it turns out that there is a chiral gauge theory which is simpler than this, in the 
sense that it has fewer fields. This theory has gauge group 


G=U(1) x SU(2) x SU(3) 


We denote the chiral matter as $(R_1, R_2)_Y$ where $R_1$ and $R_2$ are the representations
under SU(2) and SU(3) respectively, and the subscript Y denotes the U(1) gauge
charge. The left- and right-handed fermions transform as

$$\text{Left-Handed:} \quad l_L : (2,1)_{-3} \ , \quad q_L : (2,3)_{+1}$$
$$\text{Right-Handed:} \quad e_R : (1,1)_{-6} \ , \quad u_R : (1,3)_{+4} \ , \quad d_R : (1,3)_{-2} \qquad (3.73)$$

This is perhaps the most famous of all quantum field theories, for it describes the world
we live in. It is, of course, the Standard Model with a single generation of fermions.
(It is missing the Higgs field and associated Yukawa couplings which do not affect the
anomalies. Note also that we have chosen a normalisation so that the U(1) hypercharges
are integers; this differs by a factor of 6 from the conventional normalisation.) Here $l_L$
are the left-handed leptons (electron and neutrino) and $e_R$ is the right-handed electron.
Meanwhile, $q_L$ is the left-handed doublet of up and down quarks while $u_R$ and $d_R$ are
the right-handed up and down quarks. We may add to this a right-handed neutrino $\nu_R$
which is a singlet under all factors of G.


Let’s see how anomaly cancellation plays out in the Standard Model. First the
non-Abelian anomalies. The $[SU(3)]^3$ diagram is anomaly free because there are two
left-handed and two right-handed quarks. Similarly, there is no problem with the
non-perturbative SU(2) anomaly because there are 4 fermions transforming in the 2.


This leaves us only with anomalies that involve the Abelian factor. Here things
are more interesting. The $U(1)^3$ anomaly requires that the sum of charges
$\sum_{\rm left} Y^3 - \sum_{\rm right} Y^3 = 0$. (In all of these calculations, we must remember to
multiply by the dimension of the representation of the non-Abelian factors.) We have

$$U(1)^3 : \quad \left[2 \times (-3)^3 + 6 \times (+1)^3\right] - \left[(-6)^3 + 3 \times (+4)^3 + 3 \times (-2)^3\right] = 0$$

where we have arranged left- and right-handed fermions into separate square brackets.
We see already that the cancellation happens in a non-trivial way. Similarly, the mixed
U(1)-gravitational anomaly tells us that the sum of the charges must vanish,
$\sum_{\rm left} Y - \sum_{\rm right} Y = 0$:

$$U(1) \times \text{gravity}^2 : \quad \left[2 \times (-3) + 6\right] - \left[-6 + 3 \times 4 + 3 \times (-2)\right] = 0$$

Finally, we have the mixed anomalies between two factors of the gauge group. The
non-Abelian factors must appear in pairs, otherwise the contribution vanishes after
taking the trace over group indices. But we’re left with two further anomalies which
must cancel:

$$[SU(2)]^2 \times U(1) : \quad -3 + 3 \times (+1) = 0$$
$$[SU(3)]^2 \times U(1) : \quad 2 \times (+1) - \left[4 + (-2)\right] = 0$$


We see that all gauge anomalies vanish. Happily, our Universe is mathematically con- 
sistent! 
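
The bookkeeping above is easy to automate. The following sketch encodes one generation with the integer hypercharges of (3.73) and evaluates each anomaly coefficient; the data layout and function names are hypothetical choices made for illustration.

```python
# One generation of the Standard Model in the integer hypercharge normalisation of (3.73).
# Each entry: (name, chirality, dim under SU(2), dim under SU(3), hypercharge Y),
# with chirality +1 for left-handed and -1 for right-handed Weyl fermions.
FERMIONS = [
    ("l_L", +1, 2, 1, -3),
    ("q_L", +1, 2, 3, +1),
    ("e_R", -1, 1, 1, -6),
    ("u_R", -1, 1, 3, +4),
    ("d_R", -1, 1, 3, -2),
]

def u1_cubed(fermions=FERMIONS):
    return sum(s * d2 * d3 * y**3 for _, s, d2, d3, y in fermions)

def u1_gravity(fermions=FERMIONS):
    return sum(s * d2 * d3 * y for _, s, d2, d3, y in fermions)

def su2_squared_u1(fermions=FERMIONS):
    # only SU(2) doublets contribute; each comes with d3 colour copies
    return sum(s * d3 * y for _, s, d2, d3, y in fermions if d2 == 2)

def su3_squared_u1(fermions=FERMIONS):
    # only colour triplets contribute; each comes with d2 SU(2) copies
    return sum(s * d2 * y for _, s, d2, d3, y in fermions if d3 == 3)

def su3_cubed(fermions=FERMIONS):
    # A(3) = 1 for each colour triplet, counted with the chirality sign
    return sum(s * d2 for _, s, d2, d3, y in fermions if d3 == 3)

def number_of_su2_doublets(fermions=FERMIONS):
    # must be even for the non-perturbative SU(2) anomaly to be absent
    return sum(d3 for _, s, d2, d3, y in fermions if d2 == 2)
```

Changing any single hypercharge in `FERMIONS` makes at least one of these sums non-zero, which is one way to see how non-trivial the cancellation is.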


The Standard Model is arguably the simplest chiral gauge theory that one can write 
down (at least with a suitable definition of the word “simple”). It is rather striking 
that this theory is the one that describes our Universe at energy scales < 1 TeV or so. 


Could it have been otherwise? 


There are alternative games that we can play here. For example, we could take the 
matter fields of the Standard Model, but assign then arbitrary hypercharges. 


lL : (2, 1), qL : (2,3)4, eR: (1, 1)z, UR: (1,3)u, dR $ (1,3)a 


We then ask what values of the hypercharges {l,q, x, u,d} give rise to a consistent 
theory? We have constraints from the non-Abelian anomalies: 


[SU (2)? x U(1): 3q+l1=0 
[SU(3)?? x U(1): 2g-u-—d=0 (3.74) 


and the Abelian purely Abelian anomaly 
VUA: 69? + 21° — 3u? — 3d? — 2? = 0 (3.75) 


On their own, these are not particularly restrictive. However, if we also add the mixed 
gauge-gravitational anomaly 


$$U(1)\times{\rm gravity}^2:\quad 6q + 2l - 3(u+d) - x = 0 \tag{3.76}$$

then it is straightforward to show that there are only two possible solutions. The first
of these is a trivial, non-chiral assignment of the hypercharges,

$$q = l = x = 0 \quad{\rm and}\quad u = -d \tag{3.77}$$

The second is, up to an overall rescaling, the charge assignment (3.73) seen in Nature,

$$x = 2l = -3(u+d) = -6q \quad{\rm and}\quad u - d = \pm 6q$$


This is interesting. Notice that we didn’t insist on quantisation of the hypercharges 
above, yet the restrictions imposed by anomalies ensure that the resulting hypercharges 
are, nonetheless, quantised in the sense that the ratios of all charges are rational. 
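

A brute-force search makes the claim of "only two solutions" concrete. The sketch below is only a sanity check over a small box of integer charges, not a proof; the constraints are those of (3.74), (3.75) and (3.76), while the helper names `satisfies_constraints` and `classify` are hypothetical.

```python
from itertools import product


def satisfies_constraints(l, q, x, u, d):
    """The four anomaly conditions (3.74), (3.75) and (3.76)."""
    return (3 * q + l == 0
            and 2 * q - u - d == 0
            and 6 * q**3 + 2 * l**3 - 3 * u**3 - 3 * d**3 - x**3 == 0
            and 6 * q + 2 * l - 3 * (u + d) - x == 0)


def classify(l, q, x, u, d):
    """Label a solution as the non-chiral family (3.77) or a rescaled Standard Model."""
    if q == 0 and l == 0 and x == 0 and u == -d:
        return "non-chiral"
    # Rescaled Standard Model charges; u and d may be exchanged.
    if l == -3 * q and x == -6 * q and {u, d} == {4 * q, -2 * q}:
        return "standard-model"
    return "other"


# Search box: every charge between -6 and 6 (an arbitrary illustrative cutoff).
CHARGES = range(-6, 7)
solutions = [s for s in product(CHARGES, repeat=5) if satisfies_constraints(*s)]
kinds = {classify(*s) for s in solutions}
print(len(solutions), sorted(kinds))
```

Within this box every solution falls into one of the two families; in particular nothing is labelled "other". The two orderings of $u$ and $d$ in the second family correspond to the two signs of $u - d$.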


We could also turn this argument around. Suppose that we instead insist from the
outset that the hypercharges $\{l,q,x,u,d\}$ take integer values. This is the statement
that the $U(1)$ gauge group of the Standard Model is actually $U(1)$, rather than $\mathbb{R}$. We
can use the first equation in (3.74) to eliminate $l = -3q$. The second equation in (3.74)
tells us that the sum $u + d$ is even which means that the difference is also even: we
write $u - d = 2y$. The cubic $U(1)^3$ anomaly equation (3.75) then becomes


$$x^3 + 18qy^2 + 54q^3 = 0 \tag{3.78}$$


We now want to find integer solutions to this equation. There is the trivial solution
with $x = q = 0$; this gives us (3.77). Any further solution necessarily has $q \neq 0$.
Because (3.78) is a homogeneous polynomial we may rescale to set $q = 1$ and look for
rational solutions to the curve


$$x^3 + 18y^2 + 54 = 0\,,\qquad x, y\in\mathbb{Q} \tag{3.79}$$


This is a rather special elliptic curve. To see this, we introduce two new coordinates
$v, w\in\mathbb{Q}$, defined by

$$x = -\frac{6}{v+w} \quad{\rm and}\quad y = \frac{3(v-w)}{v+w}$$

This reveals the elliptic curve (3.79) to be the Fermat curve

$$v^3 + w^3 = 1$$


Any non-trivial rational solution to this equation would imply a non-trivial integer
solution to the equation $v^3 + w^3 = z^3$. Famously, there are none. The trivial solutions
are $v = 1, w = 0$ and $v = 0, w = 1$. These reproduce the hypercharge assignments
(3.73) of the Standard Model.
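

The change of variables can be checked with exact rational arithmetic. The sketch below assumes the map $x = -6/(v+w)$, $y = 3(v-w)/(v+w)$, for which one finds $x^3 + 18y^2 + 54 = 216\,(v^3 + w^3 - 1)/(v+w)^3$; the function names are hypothetical and the sample points are arbitrary.

```python
from fractions import Fraction


def curve(x, y):
    """Left-hand side of (3.79): x^3 + 18 y^2 + 54."""
    return x**3 + 18 * y**2 + 54


def to_xy(v, w):
    """Assumed change of variables x = -6/(v+w), y = 3(v-w)/(v+w)."""
    s = v + w
    return Fraction(-6) / s, 3 * (v - w) / s


def identity_holds(v, w):
    """Check curve(x, y) == 216 (v^3 + w^3 - 1) / (v + w)^3 exactly."""
    x, y = to_xy(v, w)
    return curve(x, y) == 216 * (v**3 + w**3 - 1) / (v + w)**3


# Arbitrary rational sample points, avoiding v + w = 0 where the map is singular.
samples = [(Fraction(a, b), Fraction(c, e))
           for a in range(-3, 4) for b in (1, 2, 5)
           for c in range(-3, 4) for e in (1, 3)
           if Fraction(a, b) + Fraction(c, e) != 0]

print(all(identity_holds(v, w) for v, w in samples))
print(to_xy(Fraction(1), Fraction(0)))
```

The trivial point $v = 1$, $w = 0$ maps to $x = -6$, $y = 3$, so that with $q = 1$ we recover $u = q + y = 4$ and $d = q - y = -2$. The other trivial point exchanges $u$ and $d$.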


Notice that at no point in the above argument did we make use of the mixed
gauge-gravitational anomaly. We learn that if we insist on quantised hypercharge then
consistent solutions of the gauge anomalies are sufficient to guarantee that the mixed
gauge-gravitational anomaly is also satisfied. This is a rather unusual property for a
quantum field theory.


It is well known that the Standard Model gauge group and matter content fits nicely
into a grand unified framework (either SU(5) with a 5 and 10, or SO(10) with a 16)
and it is sometimes said that this is evidence for grand unification. This, however,
is somewhat misleading: the matter content of the Standard Model is determined by
mathematical consistency alone. To find evidence for grand unification, we must look
to more dynamical issues, such as the running of the three coupling constants.


Global Symmetries in the Standard Model 


The Standard Model consists of more than just the matter content described above. 
There is also the Higgs field, a scalar transforming as $(\mathbf{2},\mathbf{1})_3$, and the associated Yukawa
couplings. After the dust has settled, the classical Lagrangian enjoys two global sym- 
metries: baryon number B and lepton number L. The charges are: 


         l_L    q_L    e_R    u_R    d_R    ν_R
    B    0      1/3    0      1/3    1/3    0
    L    1      0      1      0      0      1


Both B and L are anomalous. There is a contribution from both the SU(2) gauge 
fields, and also from the U(1) hypercharge. For the latter, the anomaly is given by 


$$\sum_{\rm left} BY^2 - \sum_{\rm right} BY^2 = \frac{1}{3}\left(6 - 3\times 4^2 - 3\times(-2)^2\right) = -18$$

and

$$\sum_{\rm left} LY^2 - \sum_{\rm right} LY^2 = 2\times(-3)^2 - (-6)^2 = -18$$


Note, however, the anomalies for B and L are the same. This is true both for the
mixed anomaly with $U(1)_Y$, as shown above, and also for the mixed anomaly with
$SU(2)$. This means that the combination $B - L$ is non-anomalous. It is the one global
symmetry of the Standard Model.


We still have to check if there is a gravitational contribution to the $B - L$ anomaly.
This vanishes only if there is a right-handed neutrino.
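

These statements can be verified in a few lines. The sketch below assumes the standard charges $B = 1/3$ for quarks and $L = 1$ for leptons, together with the hypercharges of (3.73) and $Y = 0$ for the right-handed neutrino; the field list layout and the name `mixed_anomalies` are illustrative only.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)
# (name, sign, multiplicity, is SU(2) doublet, Y, B, L); sign = +1 left, -1 right.
# Multiplicity = (dimension under SU(2)) x (dimension under SU(3)).
FIELDS = [
    ("l_L", +1, 2, True, -3, 0, 1),
    ("q_L", +1, 6, True, +1, THIRD, 0),
    ("e_R", -1, 1, False, -6, 0, 1),
    ("u_R", -1, 3, False, +4, THIRD, 0),
    ("d_R", -1, 3, False, -2, THIRD, 0),
]
NU_R = ("nu_R", -1, 1, False, 0, 0, 1)


def mixed_anomalies(fields, charge):
    """Return (anomaly with U(1)_Y^2, with SU(2)^2, with gravity^2) for a global charge.

    `charge` maps the pair (B, L) of a field to the global charge of interest.
    """
    with_hypercharge = sum(s * m * charge(B, L) * Y**2 for _, s, m, _, Y, B, L in fields)
    # Count doublets: m // 2 doublets per multiplet (one per colour for quarks).
    with_su2 = sum(s * (m // 2) * charge(B, L)
                   for _, s, m, doublet, _, B, L in fields if doublet)
    with_gravity = sum(s * m * charge(B, L) for _, s, m, _, _, B, L in fields)
    return with_hypercharge, with_su2, with_gravity


print("B    :", mixed_anomalies(FIELDS, lambda B, L: B))
print("L    :", mixed_anomalies(FIELDS, lambda B, L: L))
print("B - L:", mixed_anomalies(FIELDS, lambda B, L: B - L))
print("B - L with nu_R:", mixed_anomalies(FIELDS + [NU_R], lambda B, L: B - L))
```

The first two entries of each triple agree for B and L, so they cancel in $B - L$. The gravitational entry for $B - L$ only vanishes once the right-handed neutrino is included.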


A More General Chiral Gauge Theory


The Standard Model is the start of a 2-parameter family of chiral gauge theories, with
gauge group

$$G = U(1)\times Sp(r)\times SU(N)$$

with N odd. The matter content is a generalisation of (3.73), except there are now r
copies of each of the right-handed fermions, including the right-handed neutrino. The
chiral fermions transform in the representations


Left-Handed:   $l_L: (\mathbf{2r},\mathbf{1})_{-N}\,,\quad q_L: (\mathbf{2r},\mathbf{N})_{+1}$
Right-Handed:  $(e_\alpha)_R: (\mathbf{1},\mathbf{1})_{-2\alpha N}\,,\quad (\nu_\alpha)_R: (\mathbf{1},\mathbf{1})_{(2\alpha-2)N}$
               $(u_\alpha)_R: (\mathbf{1},\mathbf{N})_{1+(2\alpha-1)N}\,,\quad (d_\alpha)_R: (\mathbf{1},\mathbf{N})_{1-(2\alpha-1)N}$

with $\alpha = 1,\ldots,r$.


For r = 1 and N = 3, the matter content coincides with that of the Standard Model. 
One can check that all mixed gauge and gravitational anomalies vanish for arbitrary 
integer r and odd integer N. 
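

The claim can be spot-checked for small values of r and N. The sketch below assumes the charge assignments $-N$, $+1$, $-2\alpha N$, $(2\alpha-2)N$ and $1\pm(2\alpha-1)N$ with $\alpha = 1,\ldots,r$; the functions `family` and `anomalies` are hypothetical helpers, and the check covers only the four anomalies involving the U(1) factor.

```python
def family(r, N):
    """Fermions as (sign, dim under Sp(r), dim under SU(N), U(1) charge)."""
    fields = [(+1, 2 * r, 1, -N),      # l_L
              (+1, 2 * r, N, +1)]      # q_L
    for a in range(1, r + 1):
        fields += [
            (-1, 1, 1, -2 * a * N),              # (e_a)_R
            (-1, 1, 1, (2 * a - 2) * N),         # (nu_a)_R
            (-1, 1, N, 1 + (2 * a - 1) * N),     # (u_a)_R
            (-1, 1, N, 1 - (2 * a - 1) * N),     # (d_a)_R
        ]
    return fields


def anomalies(r, N):
    """Return (U(1)^3, U(1) x gravity^2, Sp(r)^2 x U(1), SU(N)^2 x U(1)).

    The filters below assume N >= 3, so that SU(N) singlets have dimension 1 != N.
    """
    f = family(r, N)
    return (
        sum(s * ds * dn * y**3 for s, ds, dn, y in f),
        sum(s * ds * dn * y for s, ds, dn, y in f),
        sum(s * dn * y for s, ds, dn, y in f if ds == 2 * r),
        sum(s * ds * y for s, ds, dn, y in f if dn == N),
    )


for r in range(1, 5):
    for N in (3, 5, 7, 9):
        print(r, N, anomalies(r, N))
```

Setting r = 1 and N = 3 in `family` returns the Standard Model charges together with a neutral right-handed neutrino.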


3.5 ’t Hooft Anomalies


So far we have classified our anomalies into two different types: anomalies in global 
symmetries (which are interesting) and anomalies in gauge symmetries (which are fa- 
tal). 


However, a closer look at the triangle diagrams suggests a better classification of these 
anomalies. Global anomalies (like the chiral anomaly) have a single global current and 
two gauge currents on the vertices of the triangle. They are better thought of as mixed 
global-gauge anomalies. What we have called gauge anomalies have gauge currents on 
all three vertices. But here too we have seen examples with mixed anomalies between 
different gauge symmetries. 


This raises the question: do we gain anything by thinking about triangle diagrams
with global symmetries on all three vertices? If the sum over triangle diagrams does
not vanish, then the global symmetry is said to have a ’t Hooft anomaly.


A global symmetry with a ’t Hooft anomaly remains a symmetry in the quantum 
theory. The charges that you think are naively conserved are, indeed, conserved. You 
only run into trouble if you couple the symmetry to a background gauge field, in which 
case the charge is no longer conserved. You run into real trouble if you try to couple 
the symmetry to a dynamical gauge field because then the ’t Hooft anomaly becomes 
a gauge anomaly and the theory ceases to make sense. In other words, the ’t Hooft 
anomaly is an obstruction to gauging a global symmetry. 


We’ve already met examples of global symmetries with a ’t Hooft anomaly above.
For example, a free Dirac fermion has two global symmetries $U(1)_V$ and $U(1)_A$, and
there is a mixed ’t Hooft anomaly between the two.


So far, it doesn’t sound like a ’t Hooft anomaly buys us very much. However, a
very simple and elegant argument, due to ’t Hooft, means that these symmetries can
be a rather powerful tool to help us understand the dynamics of strongly coupled gauge
theories. Suppose that we have some theory which, at high-energies, has a continuous
global symmetry group $G_F$ (here F stands for “flavour”). We are interested in the low-
energy dynamics and, in particular, the spectrum of massless particles. For strongly 
coupled gauge theories, this is typically a very hard problem. As we’ve seen in Section 
2, the physical spectrum need not look anything like the fields that appear in the 
Lagrangian. In particular, the quarks that appear at high-energies are often confined 
into bound states at low-energies. In this way, seemingly massless fields may get a 
mass through quantum effects. Conversely, it may be that some of these confined 
bound states themselves turn out to be massless. In short, the spectrum rearranges 
itself, often in a dramatic fashion, and we would like to figure out what’s left at very 
low energies. 


The ’t Hooft anomaly doesn’t solve this question completely, but it does provide
some insight. Here is the key idea: we gauge the global symmetry $G_F$. This
means that we introduce new gauge fields coupled to the $G_F$-currents. Now, as we
explained above, the ’t Hooft anomaly means that such a gauging is not possible since
the theory will no longer be consistent. To proceed, we must therefore also introduce
some new massless Weyl fermions which do not interact directly with the original fields,
but are coupled only to the $G_F$ gauge fields. Their role is to cancel the $G_F$ anomaly,
rendering the theory consistent. We will call these new fields spectator fermions.


What is the dynamics of this new theory? We choose the new gauge coupling to be
very small so that these gauge fields do not affect the massless spectrum of the original
theory. In particular, if the new $G_F$ gauge field itself becomes strongly coupled at some
scale $\Lambda_{\rm new}$, we will pick the gauge coupling so that $\Lambda_{\rm new}$ is much smaller than any other
scale in the game. The upshot is that at low energies (either in the strict infra-red,
or at energies $E \gg \Lambda_{\rm new}$) there are two choices:


• The symmetry group $G_F$ is spontaneously broken by the original gauge dynamics.
  In this case, the original theory, in which $G_F$ is a global symmetry, must have
  massless Goldstone modes.

• The symmetry group $G_F$ is not spontaneously broken. In this case, we are left
  with a $G_F$ gauge theory which must be free from anomalies. By construction,
  the spectator fermions contribute towards the $G_F$ anomaly which means that the
  low energy spectrum of the original theory must contain extra massless fermions
  which conspire to cancel the anomaly. This gives us a handle on the spectrum of
  massless fermions and is known as ’t Hooft anomaly matching.


The essence of anomaly matching is that one can follow the anomaly from the ultra-violet
to the infra-red. If the ’t Hooft anomaly in the ultra-violet is $\mathcal{A}_{UV}$ then the
spectator fermions must provide an anomaly $\mathcal{A}_{\rm spectator}$ such that


$$\mathcal{A}_{UV} + \mathcal{A}_{\rm spectator} = 0$$


But if the symmetry survives in the infra-red, the anomaly persists. Now the massless
fermions may look very different from those in the UV (for example, if the theory
confines then they will typically be bound states) but they must contribute $\mathcal{A}_{IR}$ to
the anomaly with


$$\mathcal{A}_{IR} + \mathcal{A}_{\rm spectator} = 0 \quad\Rightarrow\quad \mathcal{A}_{UV} = \mathcal{A}_{IR}$$


The anomaly is special because it is an exact result, yet can be seen at one-loop in 
perturbation theory. 


Anomaly matching has many uses. The standard application is to an $SU(N)$ gauge
theory coupled to $N_f$ massless Dirac fermions, each in the fundamental representation.
This is a vector-like theory, so doesn’t suffer any gauge anomaly. The global symmetry
of the classical Lagrangian is


$$G_F = U(N_f)_L\times U(N_f)_R$$


where each factor acts on the left-handed or right-handed Weyl fermions. However,
we’ve seen in Section 3.1 that the chiral anomaly means that the axial $U(1)_A$ does not
survive in the quantum theory. The non-anomalous global symmetry of the theory is


$$G_F = U(1)_V\times SU(N_f)_L\times SU(N_f)_R$$


We can see immediately that $G_F$ is likely to enjoy a ’t Hooft anomaly since the $SU(N_f)$
factors act independently on left- and right-handed fermions. The question is: what
does this tell us about the low-energy dynamics of our theory? The answer to this
question will be the topic of Section 5, so we will delay giving the full analysis until Section
5.6 where we will show that often there is no confined bound state spectrum which can
reproduce the ’t Hooft anomaly in $G_F$. This means that $G_F$ must be spontaneously
broken, and there are massless Goldstone bosons in the theory.


An Aside: Symmetry Protected Topological Phases


In condensed matter physics, there is the notion of a symmetry protected topological 
(SPT) phase. We won’t describe this in detail, but provide a few words to explain how 
this is related to ’t Hooft anomalies. 


An SPT phase is a gapped phase which, if we disregard the global symmetry, can be 
continuously connected to the trivial phase. However, if we insist that we preserve the 
global symmetry structure then it is not possible to deform an SPT phase into a trivial 
theory without passing through a quantum phase transition. 


SPT phases can be rephrased in the language of ’t Hooft anomalies. An SPT phase 
in spatial dimension d has a global symmetry G such that, when placed on a manifold 
with boundary, the $(d-1)$-dimensional theory on the boundary has a ’t Hooft anomaly
for G. 


3.6 Anomalies in Discrete Symmetries 


In this section, we turn to a slightly different topic: anomalies in discrete symmetries. 
Unlike our previous examples, these will have nothing to do with chiral fermions, or 
ultra-violet divergences in quantum field theory. Instead, our main example is an 
anomaly in pure Yang-Mills theory. 


I should mention up front that this material is somewhat more specialised than the 
rest of this chapter. We will need to invoke a whole bunch of new machinery which, 
while fun and interesting in its own right, will not be needed for the rest of these lectures. 
And, at the end of the day, we will only apply this machinery to learn something new
about $SU(N)$ Yang-Mills at $\theta = \pi$.


For those who are nervous about whether the effort is worth it, here is the gist of the
story. Recall from Section 2.6 that there are (at least) two different versions of $SU(N)$
Yang-Mills theory that differ in the global structure of the gauge group. These are $G =
SU(N)$ and $G = SU(N)/\mathbb{Z}_N$. Moreover, as we explained previously, the theta angles
take different ranges in these two cases:


$$G = SU(N) \quad\Rightarrow\quad \theta\in[0,2\pi)$$
$$G = SU(N)/\mathbb{Z}_N \quad\Rightarrow\quad \theta\in[0,2\pi N)$$


The discrete symmetry that we’re going to focus on is time reversal. As explained in
Section 1.2.5, under time reversal $\theta\to-\theta$. This means that the theory with $\theta = 0$ is
invariant under time reversal. But so too is the theory when $\theta$ takes half its range, i.e.
the time-reversal invariant values are


$$\theta = \pi \quad{\rm when}\quad G = SU(N)$$
$$\theta = \pi N \quad{\rm when}\quad G = SU(N)/\mathbb{Z}_N$$


Clearly these differ. This means that if we start with $G = SU(N)$ and $\theta = \pi$ then
we have time reversal invariance. If we subsequently “divide the gauge group by $\mathbb{Z}_N$”
(whatever that means) keeping $\theta$ unchanged, we lose time reversal invariance. This
smells very much like a mixed ’t Hooft anomaly: we do something to one symmetry
and lose the other. Roughly speaking, we want to say that there is a mixed ’t Hooft
anomaly between time reversal and the $\mathbb{Z}_N$ centre symmetry of the gauge group.


It turns out that the language above is not quite correct. There is a mixed ’t Hooft 
anomaly, but it is between rather different symmetries, known as generalised sym- 
metries. We will describe these in Section 3.6.2 below. But first it will be useful to 
highlight how a very similar ’t Hooft anomaly arises in a much simpler example: bosonic 
quantum mechanics. 


3.6.1 An Anomaly in Quantum Mechanics 


Many of the key features of discrete anomalies appear already in the quantum mechanics 
of a particle moving on a ring, around a flux tube. This is an example that we first met 
in the lectures on Applications of Quantum Mechanics when introducing the Aharonov- 
Bohm effect. We also briefly introduced this system in Section 2.2.3 of these lectures 
when discussing the theta angle. 


We start with the Lagrangian

$$L = \frac{m}{2}\dot{x}^2 + \frac{\theta}{2\pi}\dot{x} \tag{3.80}$$


where we take the coordinate x to be periodic, $x\in[0,2\pi)$. This describes a particle
of mass m moving around a solenoid with flux $\theta$. (We’ll also see this same quantum
mechanical system arising later in Section 7.1 when we consider electromagnetism in
d = 1 + 1 dimensions compactified on a spatial circle.)


The theta term is a total derivative. This ensures that it does not affect the equations
of motion and so plays no role in the classical system. However, famously, it does change
the quantum theory. To see this, we introduce the momentum

$$p = \frac{\partial L}{\partial\dot{x}} = m\dot{x} + \frac{\theta}{2\pi}$$

in terms of which, the Hamiltonian reads

$$H = \frac{1}{2m}\left(p - \frac{\theta}{2\pi}\right)^2 = \frac{1}{2m}\left(-i\frac{\partial}{\partial x} - \frac{\theta}{2\pi}\right)^2$$

where, in the second equality, we’ve used the canonical commutation relations $[x,p] = i$.


Figure 33: The energy spectrum for a particle moving around a solenoid. (Curves labelled n = 0, n = 1, n = 2.)


It is simple to solve for the spectrum of this Hamiltonian. We will ask that the
wavefunctions are single-valued in x. In this case, they are given by

$$\psi = \frac{1}{\sqrt{2\pi}}\,e^{inx}$$

where the requirement that $\psi$ is single valued around the circle means that we must
take $n\in\mathbb{Z}$. Plugging this into the time independent Schrödinger equation $H\psi = E\psi$,
we find the spectrum

$$E = \frac{1}{2m}\left(n - \frac{\theta}{2\pi}\right)^2$$


The spectrum is shown in the figure as a function of $\theta$. The key point is that the
spectrum remains invariant under $\theta\to\theta+2\pi$. However, it does so by shifting all the
states $|n\rangle\to|n+1\rangle$. This is an example of spectral flow.
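

The spectral flow is easy to see numerically. The following sketch assumes the spectrum $E_n(\theta) = (n - \theta/2\pi)^2/2m$ and sets $m = 1$ for convenience; the function names and the truncation `n_max` are our own choices.

```python
import math


def energy(n, theta, m=1.0):
    """Assumed spectrum E_n(theta) = (n - theta / 2 pi)^2 / (2 m)."""
    return (n - theta / (2 * math.pi)) ** 2 / (2 * m)


def ground_states(theta, n_max=5, tol=1e-12):
    """Labels n of the lowest-energy states among |n| <= n_max."""
    levels = {n: energy(n, theta) for n in range(-n_max, n_max + 1)}
    lowest = min(levels.values())
    return sorted(n for n, e in levels.items() if abs(e - lowest) < tol)


# The state labelled n at theta + 2 pi has the energy of the state n - 1 at theta,
# so the spectrum as a whole is unchanged while the labels shift by one.
theta = 0.7
shifted = all(math.isclose(energy(n, theta + 2 * math.pi), energy(n - 1, theta),
                           abs_tol=1e-12)
              for n in range(-3, 4))

print(shifted)
print(ground_states(0.0), ground_states(math.pi), ground_states(5.0))
```

The ground state is unique away from $\theta = \pi$, where the levels n = 0 and n = 1 cross.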


The fact that our system is periodic in $\theta$ will be important. Because of this, here
are two further explanations. First, the path integral. Consider the Euclidean path
integral with temporal $S^1$ parameterised by $\tau\in[0,\beta)$. Then the field configurations
include instantons, labelled by the winding number of the map $x: S^1\to S^1$,


$$\int_{S^1} d\tau\ \partial_\tau x = 2\pi k\,,\qquad k\in\mathbb{Z}$$


Because the $\theta$-term has a single time derivative, it comes with a factor of i in the
Euclidean path integral, which is weighted by $e^{i\theta k}$ with $k\in\mathbb{Z}$. We see that the
partition function is invariant under $\theta\to\theta+2\pi$.


Next, Hamiltonian quantisation. Here, the fact that $H_\theta$ and $H_{\theta+2\pi}$ are equivalent
quantum systems can be stated formally by the conjugation


$$e^{ix}\,H_\theta\,e^{-ix} = H_{\theta+2\pi}$$


Note that the operator $e^{ix}$ is particularly natural. Indeed, the classical periodicity of
x really means that x is not a good quantum operator; instead, we should only work
with $e^{ix}$.

Symmetries 


It will prove useful to describe the symmetries of the model. First, for all values of $\theta$,
there is an $SO(2)\cong U(1)$ symmetry which, classically, acts as translations: $x\to x+\alpha$.
In the quantum theory, we implement this by the operator $T_\alpha$, with $\alpha\in[0,2\pi)$. It acts
on operators as


$$T_\alpha\,e^{ix}\,T_{-\alpha} = e^{i\alpha}\,e^{ix}$$

and on states as

$$T_\alpha|n\rangle = e^{i\alpha n}|n\rangle$$


For the two special values $\theta = 0$ and $\theta = \pi$, the system also enjoys a parity symmetry
which acts classically as $P: x\to-x$. In the quantum theory, this acts on the operator
as


$$P\,e^{ix}\,P = e^{-ix} \quad{\rm with}\quad P^2 = 1$$


One could also view this as charge conjugation since it flips the charge of the particle
moving around the solenoid; in addition, the theory has an anti-unitary time-reversal
invariance at $\theta = 0$ and $\pi$ but this does not seem to buy us anything new.


The action of parity on the states depends on whether $\theta = 0$ or $\theta = \pi$. Let’s look at
each in turn.


$\theta = 0$: Here we have $P: |n\rangle\to|-n\rangle$. There is a unique ground state, $|0\rangle$, so parity
is unbroken. However, all higher states come in pairs $|\pm n\rangle$, related by parity. We can
now look at the interplay of parity and translations. It is simple to see that


$$P\,T_\alpha\,P = T_{-\alpha}$$


Mathematically, the $SO(2)$ symmetry and $\mathbb{Z}_2$ combine into $O(2) = \mathbb{Z}_2\ltimes SO(2)$ where
the semi-direct product $\ltimes$ is there because, as we see above, P and $T_\alpha$ do not commute.


$\theta = \pi$: Now there are two ground states: $|0\rangle$ and $|1\rangle$. They have different charges
under translations, with


$$T_\alpha|0\rangle = |0\rangle \quad{\rm and}\quad T_\alpha|1\rangle = e^{i\alpha}|1\rangle$$


Clearly the action of parity can no longer be the same as when $\theta = 0$, because the
states $|n\rangle$ and $|-n\rangle$ are not degenerate. Instead, parity now acts as


$$P: |n\rangle\to|-n+1\rangle$$


In particular, $P|0\rangle = |1\rangle$ and $P|1\rangle = |0\rangle$. This shift also shows up when we see how
parity mixes with translations. We now have


$$P\,T_\alpha\,P = e^{i\alpha}\,T_{-\alpha}$$


This is no longer the group O(2); it is sometimes referred to as the central extension
of O(2). Said slightly differently, we have a projective representation of O(2) on the
Hilbert space $\mathcal{H}$ of the theory. We can define a representation of O(2) on the rays
$\mathcal{H}/\mathbb{C}^\star$, but this does not lift to a representation on the Hilbert space itself.
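

The extra phase can be exhibited directly on basis states. The sketch below assumes the actions $T_\alpha|n\rangle = e^{i\alpha n}|n\rangle$, together with $P|n\rangle = |-n\rangle$ at $\theta = 0$ and $P|n\rangle = |1-n\rangle$ at $\theta = \pi$; a state is represented as a hypothetical pair (label, amplitude).

```python
import cmath


def translate(alpha, state):
    """T_alpha acting on a basis state (n, amplitude)."""
    n, amp = state
    return n, amp * cmath.exp(1j * alpha * n)


def parity(state, theta_is_pi):
    """P acting on a basis state: n -> -n at theta = 0, n -> 1 - n at theta = pi."""
    n, amp = state
    return (1 - n if theta_is_pi else -n), amp


def extra_phase(alpha, n, theta_is_pi):
    """Ratio of P T_alpha P |n> to T_{-alpha} |n>; equal to 1 for a genuine O(2)."""
    start = (n, 1.0 + 0.0j)
    _, lhs = parity(translate(alpha, parity(start, theta_is_pi)), theta_is_pi)
    _, rhs = translate(-alpha, start)
    return lhs / rhs


alpha = 0.3
print([extra_phase(alpha, n, False) for n in range(-2, 3)])
print([extra_phase(alpha, n, True) for n in range(-2, 3)])
```

At $\theta = 0$ the ratio is 1 on every state, while at $\theta = \pi$ it is the state-independent phase $e^{i\alpha}$. It is this phase that signals the projective representation.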


$\theta\neq 0,\pi$: When $\theta$ does not take a special value, there is no $\mathbb{Z}_2$ symmetry and a
unique ground state. For $\theta<\pi$, the ground state is $|0\rangle$; for $\theta>\pi$ it is $|1\rangle$.


Coupling to Background Gauge Fields


For the chiral anomaly, the breakdown of the symmetry showed up most clearly when
we coupled to background gauge fields (3.34). Our quantum mechanical example is no
different. We turn on a background gauge field for the U(1) symmetry $x\to x+\alpha$. This
means that we return to our original Lagrangian (3.80) and replace it with the action


$$S_{\theta,p} = \int dt\ \frac{m}{2}(\dot{x}+A_0)^2 + \frac{\theta}{2\pi}(\dot{x}+A_0) + pA_0$$

This Lagrangian is invariant under the symmetry $x\to x+\alpha(t)$ and $A_0\to A_0-\dot{\alpha}(t)$.


We’ve also included an extra term, $pA_0$. This is an example of a quantum mechanical
Chern-Simons term. (We’ll spend some time discussing the d = 2 + 1 version of this
term in Section 8.4.) We’ve already encountered terms like this before in Section 2.1.3,
where we argued that it was compatible with gauge invariance provided


$$p\in\mathbb{Z}$$

Our new action is not quite invariant under $\theta\to\theta+2\pi$. We now have

$$S_{\theta+2\pi,p} = S_{\theta,p+1}$$


Equivalently, we should identify $(\theta,p)\sim(\theta+2\pi,p-1)$.


Now let’s look at the action of parity. We still have $x\to-x$, but this must now
be augmented by $P: A_0\to-A_0$. At $\theta = 0$, this is still a good symmetry of the theory
provided that p = 0. However, at $\theta = \pi$, we have a problem. The action of parity maps
$\theta = \pi$ to $\theta = -\pi$ and $p\to-p$. We then need to shift $\theta$ back to $\pi$ which, in turn, shifts
$p\to p-1$. In other words,


$$P: (\theta,p) = (\pi,p)\ \to\ (-\pi,-p)\ \sim\ (\pi,-p-1)$$


But there is no $p\in\mathbb{Z}$ for which $-p-1 = p$. The fact that the Chern-Simons levels
necessarily differ after parity means that the theory is not parity invariant at $\theta = \pi$: it
suffers a mixed ’t Hooft anomaly between parity and translations.
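

The bookkeeping in this argument can be spelled out as a tiny calculation. The sketch below tracks the pair $(\theta, p)$ with $\theta$ measured in units of $\pi$, using the identification $(\theta,p)\sim(\theta+2\pi,p-1)$; the function name and the finite search window are illustrative assumptions.

```python
def parity_image_at_pi(p):
    """Level obtained from p at theta = pi after parity and the shift back to theta = pi."""
    theta_units, level = 1, p
    # Parity: theta -> -theta and p -> -p.
    theta_units, level = -theta_units, -level
    # Identification (theta, p) ~ (theta + 2 pi, p - 1) brings theta back to pi.
    theta_units, level = theta_units + 2, level - 1
    assert theta_units == 1
    return level


# Parity would be a symmetry only for a level that maps to itself.
fixed_levels = [p for p in range(-1000, 1001) if parity_image_at_pi(p) == p]
print(fixed_levels)
```

The list is empty because $-p-1 = p$ would require $p = -1/2$, which is not an integer.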


The Partition Function 


Here is yet another way to say the same thing. Let’s consider the Euclidean partition
function, with Euclidean time $S^1$ of radius $\beta$. We introduce the chemical potential
$\int d\tau\,A_0 = \mu$. Large gauge transformations mean that $\mu\sim\mu+2\pi$.


We can compute the partition function

$$Z[\mu] = {\rm Tr}\ e^{-\beta H}\,e^{i\mu Q}$$

where Q is the U(1) charge of the state. We will compute the partition function at
$\theta = \pi$. For our purposes it will suffice to focus on the ground states $|0\rangle$ and $|1\rangle$ which
we take to have E = 0. These have charges Q = 0 and Q = 1 respectively. We have


$$Z_{\rm ground} = 1 + e^{i\mu}$$


Under parity, we have $P: \mu\to-\mu$. We see again that the partition function is not
invariant under parity. This is not surprising: the two states have different
charges under the U(1) symmetry.


There is, however, once again a loophole. The two states $|0\rangle$ and $|1\rangle$ have charges
that differ by 1. We can make the theory parity invariant if we assign these two states
charges $\pm\frac{1}{2}$. The partition function is then


$$Z_{\rm new} = e^{-i\mu/2} + e^{i\mu/2} = e^{-i\mu/2}\,Z_{\rm ground}$$

Now we have a partition function that is invariant under parity. But there’s a price
we’ve paid: it is no longer invariant under $\mu\to\mu+2\pi$. This is reminiscent of the story
of chiral fermions, where we could shift the anomaly between the $U(1)_V$ and $U(1)_A$
symmetries.
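

The trade-off between the two symmetries can be seen by evaluating both partition functions numerically. The sketch below assumes the ground state expressions $Z_{\rm ground} = 1 + e^{i\mu}$ and $Z_{\rm new} = e^{-i\mu/2} + e^{i\mu/2}$; the function names and the sample value of $\mu$ are arbitrary.

```python
import cmath
import math


def z_ground(mu):
    """Ground state partition function with charges Q = 0 and Q = 1."""
    return 1 + cmath.exp(1j * mu)


def z_new(mu):
    """Ground state partition function with charges Q = -1/2 and Q = +1/2."""
    return cmath.exp(-0.5j * mu) + cmath.exp(0.5j * mu)


mu = 0.7
two_pi = 2 * math.pi

# Integer charges: periodic in mu, but not invariant under parity mu -> -mu.
print(cmath.isclose(z_ground(mu + two_pi), z_ground(mu)),
      cmath.isclose(z_ground(-mu), z_ground(mu)))

# Half-integer charges: parity invariant, but mu -> mu + 2 pi flips the sign.
print(cmath.isclose(z_new(-mu), z_new(mu)),
      cmath.isclose(z_new(mu + two_pi), -z_new(mu)))
```

Neither choice of charges respects both parity and large gauge transformations. This is the anomaly restated in terms of the partition function.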


Adding a Potential


So far we’ve argued that there is a subtle interplay between parity and translations
when $\theta = \pi$, which we can think of as a ’t Hooft anomaly. But what is it good for?
As we now explain, anomalies of this kind can be used to restrict the dynamics of the
theory.


To see this, we remove the background gauge field but, in its place, turn on a potential
for x. Clearly any potential must be invariant under $x\to x+2\pi$. However, we will
request something more: we will ask that the potential is invariant under $x\to x+\pi$.
For example, we consider a cosine potential, with Lagrangian

$$L = \frac{m}{2}\dot{x}^2 + \frac{\theta}{2\pi}\dot{x} + \lambda\cos(2x)$$

This has two classical ground states at x = 0 and $x = \pi$. Moreover, the U(1) translation
symmetry is broken to

$$U(1)\ \to\ \mathbb{Z}_2$$

This means that at $\theta = 0$ and $\theta = \pi$ we have two discrete symmetries: $T_\pi: x\to x+\pi$
and $P: x\to-x$.


At $\theta = 0$, the operators obey the algebra $T_\pi P = PT_\pi$. This is the algebra $\mathbb{Z}_2\times\mathbb{Z}_2$.


But at $\theta = \pi$ there is a subtlety. The central extension means that these generators
obey


$$P\,T_\pi\,P = -T_\pi \tag{3.81}$$


We can define the two elements a = P and $b = T_\pi P$. These obey $a^2 = 1$ and $b^2 =
T_\pi PT_\pi P = -1$ so that $b^4 = 1$. Also, we have $aba = b^{-1}$. This is the $D_8$ algebra; it is
the group of symmetries of a square.


The $D_8$ algebra can’t act on a single ground state. In particular, if both $T_\pi$ and P
act as phases on a state, then we can’t satisfy the algebra (3.81). That means that the
quantum mechanics must have two ground states for all values of $\lambda$. We can reach the
same conclusion for any potential that retains $T_\pi$ as a symmetry.
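

The algebra can be made explicit using 2×2 matrices. The sketch below assumes the representation on the two ground states of the free theory at $\theta = \pi$, where $T_\pi|n\rangle = e^{i\pi n}|n\rangle$ and P exchanges $|0\rangle$ and $|1\rangle$; the helper functions are hypothetical and use plain tuples rather than a matrix library.

```python
def mul(a, b):
    """Product of two 2x2 matrices stored as tuples of tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))


def neg(a):
    """Minus a 2x2 matrix."""
    return tuple(tuple(-x for x in row) for row in a)


IDENTITY = ((1, 0), (0, 1))
T_PI = ((1, 0), (0, -1))   # T_pi |0> = |0>, T_pi |1> = -|1>
P = ((0, 1), (1, 0))       # P swaps |0> and |1>


def generated_group(generators):
    """Close the set of generators under multiplication (finite group assumed)."""
    group = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = mul(g, h)
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return group


a, b = P, mul(T_PI, P)
print(mul(mul(P, T_PI), P) == neg(T_PI))        # the relation (3.81)
print(mul(b, b) == neg(IDENTITY))               # b^2 = -1
print(mul(mul(a, b), a) == mul(b, mul(b, b)))   # a b a = b^3 = b^{-1}
print(len(generated_group([P, T_PI])))          # order of the group
```

The group generated by P and $T_\pi$ has eight elements. Since $b^2 = -1$ cannot be realised by 1×1 matrices that also satisfy (3.81), the smallest faithful representation is two-dimensional.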


This argument is slick, but it is powerful. Usually we learn that double-well quantum
mechanics has just a single ground state, with the two classical ground states split by
instantons. The argument above says that this doesn’t happen in the present situation
when $\theta = \pi$. This is perhaps rather surprising. At a more prosaic level, it arises because
there are two instantons which tunnel between the two vacua, one which goes one way
around the circle and one which goes the other. At $\theta = \pi$, these two contributions
should cancel.
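

The cancellation can be illustrated with a short calculation. The sketch below assumes that the two instantons share the same real action $S_0$ and fluctuation factor, and differ only in the phase $e^{i\theta\Delta x/2\pi}$ from the theta term, with $\Delta x = \pm\pi$; the function name and the value of $S_0$ are placeholders.

```python
import cmath
import math


def tunnelling_amplitude(theta, s0=1.0):
    """Sum of the two instanton weights, up to a common fluctuation factor.

    Assumes both paths have real action s0 and wind by +pi and -pi respectively.
    """
    phases = [cmath.exp(1j * theta * dx / (2 * math.pi)) for dx in (math.pi, -math.pi)]
    return math.exp(-s0) * sum(phases)


print(tunnelling_amplitude(0.0))
print(abs(tunnelling_amplitude(math.pi)))
```

Under these assumptions the sum is proportional to $\cos(\theta/2)$. It is largest at $\theta = 0$, where the usual level splitting occurs, and vanishes at $\theta = \pi$.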


3.6.2 Generalised Symmetries 


We want to build up to understanding discrete anomalies in Yang-Mills theory. How- 
ever, the anomalies turn out to lie in a class of symmetries that are a little unfamiliar. 
These go by the name of generalised symmetries. 


We will first discuss generalised global symmetries (as opposed to gauge symmetries).
We’re very used to dealing with such symmetries as acting on fields or, more generally,
local operators of the theory. We have both continuous and discrete symmetries.
Continuous symmetries have an associated current J which is a 1-form obeying $d{\star}J = 0$.
(In contrast to the rest of the lectures, throughout this section we use the notation of
forms.) The charge is constructed from J, together with a co-dimension 1 submanifold
$M\subset X$ of spacetime X,


$$Q = \int_M {\star}J \tag{3.82}$$

This charge then acts on local operators, defined at a point x, by

$$e^{i\alpha Q}\,\mathcal{O}^i(x) = R^i{}_j\,\mathcal{O}^j(x)$$

where $R^i{}_j$ is the generator of the group element and $x\in M$.


If we have a discrete symmetry, there is no current but, nonetheless, the generator is 
still associated to a co-dimension one manifold. We will refer to both continuous and 
discrete symmetries of this type as 0-form symmetries. These are the usual, familiar 
symmetries of quantum field theories that we have happily worked with our whole lives. 


The idea of a generalised symmetry is to extend the ideas above to higher-form
symmetries. We define a q-form symmetry to be one such that the generator is associated
to a co-dimension q + 1 manifold $M\subset X$. If the symmetry is continuous, then there is
a (q + 1)-form current J and the generator can again be written as (3.82).


For q > 0, these generalised symmetries are always Abelian. This follows from the
group multiplication, $Q_a(M)\,Q_b(M)$. When q = 0, the manifolds M are co-dimension
one and we can make sense of this product by time ordering the manifolds M. For 
q > 0, there is no such ordering. This means that the operators must all commute with 
each other. 


A q-form symmetry acts on an operator associated to a q-dimensional manifold C. 
Here our interest lies in 1-form symmetries. These act on line operators such as the 
Wilson and ’t Hooft lines. Take, for example, a Wilson line W. The action of a 1-form 
symmetry takes the form QW = rW where r is a phase and the manifolds M and C 
have linking number 1. 


Generalised Symmetries in Maxwell Theory 


Our ultimate interest is in generalised symmetries in Yang-Mills theory. But it will 
prove useful to first discuss generalised symmetries in the context of pure Maxwell 
theory. 


There are two 2-forms which are conserved. Each can be thought of as the current
for a global 1-form symmetry


$$\text{Electric 1-form symmetry:}\quad J^e = \frac{2}{g^2}\,F \tag{3.83}$$
$$\text{Magnetic 1-form symmetry:}\quad J^m = \frac{1}{2\pi}\,{\star}F$$


Each of these currents is conserved, in the sense that they obey $d{\star}J = 0$. The electric
1-form symmetry shifts the gauge field by a flat connection: $A\to A+d\alpha$. In contrast,
the action of the magnetic 1-form symmetry is difficult to see in the electric description;
instead, it shifts the magnetic gauge field $\tilde{A}$ by a flat connection. Relatedly, the electric
1-form symmetry acts on Wilson lines W; the magnetic 1-form symmetry acts on ’t
Hooft lines T.


The fate of these symmetries depends on the phase of the theory which, as explained
in Sections 2.5 and 2.6, is governed by the Wilson and ’t Hooft line expectation values.
These typically give either area law, or perimeter law. We will say:


$$\text{Area law:}\quad \langle W\rangle\sim e^{-A} \quad\Rightarrow\quad \langle W\rangle = 0$$
$$\text{Perimeter law:}\quad \langle W\rangle\sim e^{-L} \quad\Rightarrow\quad \langle W\rangle\neq 0$$

This may look a little arbitrary, but it is a natural generalisation of what we already
know. A traditional, 0-form symmetry is said to be spontaneously broken if a charged
operator $\mathcal{O}$ has expectation value
$\lim_{|x-y|\to\infty}\langle\mathcal{O}(x)\mathcal{O}(y)\rangle = \langle\mathcal{O}(x)\rangle\langle\mathcal{O}(y)\rangle\neq 0$. In other
words, the expectation value depends only on the edge points x and y. The analogy
for a 1-form symmetry is that the expectation value depends only on the perimeter.


With this convention, in the Coulomb phase we have $\langle W\rangle\neq 0$ and $\langle T\rangle\neq 0$, so that
both symmetries are spontaneously broken. But a broken global symmetry should give
rise to an associated massless Goldstone boson. This is nothing but the photon itself,


$$\langle 0|F_{\mu\nu}|\epsilon,p\rangle\sim(\epsilon_\mu p_\nu - \epsilon_\nu p_\mu)\,e^{ip\cdot x}$$

This gives a rather surprising new perspective on an old question. Whenever we have 
massless degrees of freedom, there is usually some underlying reason. For massless 
scalar fields, Goldstone’s theorem typically provides the reason. But we see that we 
can also invoke Goldstone’s theorem to explain why the photon is gapless: we just need 
to extend its validity to higher form symmetries. 


We can also think about the fate of these symmetries when we add matter to the 
theory. Suppose, first, that we introduce charged electric degrees of freedom. This 
explicitly breaks the electric one-form symmetry since $d{\star}J\sim d{\star}F$ which no longer
vanishes because the Maxwell equations now have a source. However, the magnetic 
symmetry, which follows from the Bianchi identity, survives. It is spontaneously broken 
in the Coulomb phase, but unbroken in the Higgs phase. Moreover, here we have 
magnetic vortex strings described in Section 2.5.2, that carry charge under the 1-form 
symmetry. 


In contrast, if we introduce magnetic degrees of freedom then only the electric 1-form 
symmetry survives. This is broken in the Coulomb phase, but unbroken in the Higgs 
phase where the confining electric strings carry charge. 


There is a variant of this. Suppose that we add electrically charged matter but with
charge N. Then there is a $\mathbb{Z}_N$ electric 1-form symmetry which shifts the gauge field by
a flat connection with $\mathbb{Z}_N$ holonomy which leaves the matter invariant. In the Coulomb
phase, both this symmetry and the magnetic 1-form symmetry are broken, as before.
But something novel happens in the Higgs phase where the gauge symmetry breaks
$U(1)\to\mathbb{Z}_N$. Now $\langle W\rangle\neq 0$ reflecting the fact that the $\mathbb{Z}_N$ electric 1-form symmetry is
spontaneously broken, while the magnetic 1-form symmetry survives. Alternatively, we
could also add charge 1 monopoles which condense, so that the gauge theory confines.
Now $\langle W\rangle = 0$ but $\langle W^N\rangle\neq 0$ since the dynamical matter can screen, causing the string
to break. We see that the $\mathbb{Z}_N$ electric 1-form symmetry is unbroken in this phase.


The various dynamics on display above suggests the following relationship: 
Spontaneously broken 1-form symmetry H = Unbroken gauge symmetry H 


This is interesting. A discrete gauge symmetry in the infra-red is a form of topological
order. This is because, when compactified on non-trivial manifolds, we can have
flat connections for this discrete gauge symmetry, which is another way of saying
holonomy around cycles. These flat connections can then give rise to multiple ground
states.


Generalised Symmetries in Yang-Mills 


Finally, we turn to our main topic of interest. We will study the generalised symmetries
in Yang-Mills theory with two different gauge groups, G = SU(N) and SU(N)/Z_N.
The latter group is sometimes referred to as PSU(N) = SU(N)/Z_N. Much of what
we have to say will be a recapitulation of the ideas we saw in Section 2.6.2 regarding
’t Hooft and Wilson lines, now viewed in the language of generalised symmetries.


G = SU(N) 


The Abelian story above has a close analog in non-Abelian gauge dynamics. We start
by considering the case of simply connected gauge group, G = SU(N). We can have
Wilson lines in all representations of G, with charges lying anywhere in the electric
weight lattice. If we denote the Wilson line in the fundamental representation by W,
this means that we have W^l for all l = 1, 2, .... In contrast, the ’t Hooft lines must
carry charges in the magnetic root lattice. If we denote the “fundamental” ’t Hooft
line as T, this means that we only have T^N and multiples thereof.


As long as there is no matter transforming under the Z_N centre of SU(N), then the
theory also has an electric Z_N one-form symmetry. This acts by shifting the gauge field
by a flat Z_N gauge connection or, equivalently, inducing a holonomy in the Z_N centre
of SU(N). Another way of saying this is that the Wilson line W picks up a phase,
W → ωW with ω^N = 1, under this 1-form symmetry.


When the theory lies in the confining phase, the Z_N 1-form symmetry is unbroken.
Here we have ⟨W⟩ ~ e^{-A}, with A the area of the loop, and the theory has electric flux
tubes which, due to the absence of fundamental matter, cannot break. These electric
flux tubes are Z_N strings which carry charge under the Z_N one-form symmetry.


This theory also has a different phase. We can access this if we introduce scalar fields
φ transforming in the adjoint of the gauge group, so that the Z_N one-form symmetry
remains. Then by going to a Higgs phase with ⟨φ⟩ ≠ 0, we have ⟨W⟩ ~ e^{-L}, with L the
perimeter of the loop. Now the Z_N symmetry is broken. Correspondingly, there are no
electric flux tubes in this phase. However, we now have a topological field theory at
low energies because G = SU(N) → Z_N, so a discrete Z_N gauge symmetry remains.


Summarising, we can view the Wilson line as an order parameter for the electric
one-form symmetry


$$ \text{Electric } \mathbb{Z}_N \text{ one-form symmetry:} \quad \begin{cases} \text{unbroken} & \text{if } \langle W\rangle \sim e^{-A} \\ \text{broken} & \text{if } \langle W\rangle \sim e^{-L} \end{cases} $$


A broken Z_N one-form symmetry gives rise to a Z_N gauge symmetry.


G = SU(N)/Z_N


Now let’s consider how this story changes when G = SU(N)/Z_N. The Wilson lines
are now restricted to lie in the electric root lattice, so only multiples of W^N survive.
In contrast, the whole range of ’t Hooft lines T^l with l = 1, 2, ... are allowed. (Strictly
speaking, this is true at θ = 0; we'll look at the role of the θ angle below.)


The theory now has a magnetic Z_N one-form symmetry, whose order parameter is
the ’t Hooft line T. We have


$$ \text{Magnetic } \mathbb{Z}_N \text{ one-form symmetry:} \quad \begin{cases} \text{unbroken} & \text{if } \langle T\rangle \sim e^{-A} \\ \text{broken} & \text{if } \langle T\rangle \sim e^{-L} \end{cases} $$


So this magnetic Z_N one-form symmetry is broken in the confining phase, resulting in
an emergent Z_N magnetic gauge symmetry.


3.6.3 Discrete Gauge Symmetries 


We’re going to need one final piece of technology for our story. This is the idea of a 
gauge symmetry based on a discrete, rather than continuous, group. 


It’s tempting to think of a gauge symmetry as something in which the transformation
can take different values at different points in space. But this approach clearly
runs into problems for a discrete group since the transformation parameter cannot vary
continuously. Instead, we should remember the by-now familiar mantra: gauge symmetries
are redundancies. A discrete gauge symmetry simply means that we identify
configurations related by this symmetry.


There is a simple, down-to-earth method to arrive at a discrete gauge theory: we start
with a continuous gauge theory, and subsequently break it down to Z_N. Indeed, we
already saw two examples of this above. In the first, we start with U(1) gauge theory,
with a scalar of charge N. Upon condensation, we have U(1) → Z_N. Alternatively, we
could take SU(N) gauge theory with adjoint Higgs fields, giving rise to SU(N) → Z_N.


Here we take the U(1) gauge theory as our starting point. We can focus on the phase,
φ ∈ [0, 2π), of the scalar field. We have a gauge symmetry


$$ \phi \to \phi + N\alpha $$


where α ~ α + 2π is also periodic. In the Higgs phase, the scalar kinetic term is


$$ \mathcal{L}_1 = t^2\,(d\phi - NA)\wedge{}^\star(d\phi - NA) $$


for some t ∈ R which is set by the expectation value of the scalar. In the low-energy
limit, t² → ∞ and we have A = (1/N) dφ which tells us that the connection must be flat.
However, something remains because the holonomy around any non-contractible loop
can be (1/2π) ∮ A ∈ (1/N) Z.


It is useful to dualise φ. We do this by first introducing a 3-form H and writing


$$ \mathcal{L}_{1.5} = \frac{1}{(4\pi)^2 t^2}\, H\wedge{}^\star H + \frac{i}{2\pi}\, H\wedge(d\phi - NA) $$


Integrating out H through the equation of motion ⋆H = 4πi t² (dφ − NA) takes us back
to the original Lagrangian L_1. Meanwhile, if we send t² → ∞ at this stage, we get the
Lagrangian


$$ \mathcal{L}_{1.5} \to \frac{i}{2\pi}\, H\wedge(d\phi - NA) $$


where H now plays the role of a Lagrange multiplier, imposing A = (1/N) dφ. Alternatively,
we can instead integrate out φ in L_{1.5}. The equation of motion requires that dH = 0.
This means that we can write H = dB locally. We’re then left with the Lagrangian


$$ \mathcal{L}_2 = \frac{1}{(4\pi)^2 t^2}\, H\wedge{}^\star H + \frac{iN}{2\pi}\, B\wedge dA $$


In the limit t² → ∞, this becomes


$$ \mathcal{L}_{BF} = \frac{iN}{2\pi}\, B\wedge dA $$


This Lagrangian is known as BF theory. It is deceptively simple and, as we have seen
above, is ultimately equivalent to a Z_N discrete gauge symmetry. Our task now is to
elucidate how this works. The subtleties arise from the fact that the two gauge fields
have quantised periods, so when integrated over appropriate cycles yield


$$ \int_{\Sigma_2} F \in 2\pi\mathbb{Z} \quad\text{and}\quad \int_{\Sigma_3} H \in 2\pi\mathbb{Z} $$


The BF theory has two gauge symmetries: A → A + dα and B → B + dλ. However, as
we've seen, the U(1) gauge theory for A is actually Higgsed down to Z_N, a fact which
is clear in our initial formulation in L_1, but less obvious in the BF theory formulation.
Similarly, the 1-form gauge symmetry for B is also Higgsed down to a Z_N 1-form gauge
symmetry. To see this, we dualise A. We first add a Maxwell term for F = dA and
consider the Lagrangian


$$ \mathcal{L}_{2.5} = \frac{1}{2e^2}\, F\wedge{}^\star F - \frac{i}{2\pi}\, F\wedge(d\hat{A} - NB) $$


If we integrate out the 1-form Â, we recover the fact that F = dA locally. Note that if
we send e² → ∞, to remove the Maxwell term, we’re left with


$$ \mathcal{L}_{2.5} \to -\frac{i}{2\pi}\, F\wedge(d\hat{A} - NB) $$    (3.84)


where F now plays the role of a Lagrange-multiplier 2-form. Alternatively, we can
instead integrate out F using its equation of motion ⋆F = (ie²/2π)(dÂ − NB) to get


$$ \mathcal{L}_3 = \frac{e^2}{8\pi^2}\,(d\hat{A} - NB)\wedge{}^\star(d\hat{A} - NB) $$


This now takes a similar form to the action L_1 that we started with. We should view
the dual gauge field Â as a matter field which is charged under B. Correspondingly,
the U(1) 1-form gauge symmetry is Higgsed down to Z_N.


What we learn from this is that a Z_N discrete gauge theory also comes with a Z_N
1-form gauge symmetry.


The Operators


Our theory has two gauge symmetries, under which


$$ \phi \to \phi + N\alpha \quad\text{and}\quad A \to A + d\alpha $$
$$ \hat{A} \to \hat{A} + N\lambda \quad\text{and}\quad B \to B + d\lambda $$


As we’ve seen, both are Higgsed down to Z_N. Nonetheless, all operators that we write
down must be invariant under these symmetries. Examples of such operators include


$$ d\phi - NA \sim {}^\star H \quad\text{and}\quad d\hat{A} - NB \sim {}^\star F $$    (3.85)


where the equations of motion show that these are actually related to the dual fields
H and F respectively. However, these are all trivial in the theory. To find something
more interesting, we must turn to line and surface operators.


There are two electric operators, a Wilson line W_A[C] and a “Wilson surface”, W_B[S],


$$ W_A[C] = \exp\left(i\oint_C A\right) \quad\text{and}\quad W_B[S] = \exp\left(i\int_S B\right) $$


As usual, the Wilson line describes the insertion of a probe particle of charge 1 with
worldline C. Meanwhile, the Wilson surface describes the insertion of a vortex string
with worldsheet S. The scalar φ has winding ∮ dφ = 2π around the vortex which, using
A = (1/N) dφ, means that the vortex string carries magnetic flux 1/N. A particle of charge
1 picks up a holonomy 2π/N through the Aharonov-Bohm effect. This is captured in
the correlation function


$$ \langle W_A[C]\, W_B[S] \rangle = \exp\left(\frac{2\pi i}{N}\, n(C,S)\right) $$


where n(C, S) is the linking number of C and S. This correlation function is the
non-trivial content of the Z_N gauge theory. We see, in particular, that the operators W_A^N
and W_B^N are both trivial in the sense that they commute with all other operators. This
can also be understood by a Z_N gauge transformation which takes a general operator


$$ W_A^q[C] = \exp\left(iq\oint_C A\right) $$


and shifts q → q + N. Note that we can also think of this as a Z_N global 1-form
symmetry. Because ⟨W_A[C]⟩ ~ e^{-L}, this 1-form symmetry is spontaneously broken,
in agreement with our previous discussion that this should accompany a Z_N gauge
symmetry.
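

As a quick sanity check of this correlation function, here is a minimal sketch in Python.
It is not part of the original discussion: the function name linking_phase and the charge
labels q_a and q_b are our own, and we assume that a Wilson line of charge q_a which
links n times with a Wilson surface of charge q_b picks up the phase exp(2πi q_a q_b n/N).
The sketch illustrates that the phase is unchanged under q → q + N, and that operators
of charge N are invisible.

```python
import cmath
from fractions import Fraction


def linking_phase(N, q_a, q_b, n):
    """Phase exp(2*pi*i*q_a*q_b*n/N) for charges q_a, q_b and linking number n.

    The exponent is reduced mod 1 with exact rational arithmetic first, so
    that trivial cases come out as exactly 1 rather than 1 up to rounding.
    """
    exponent = Fraction(q_a * q_b * n, N) % 1
    if exponent == 0:
        return complex(1.0, 0.0)
    return cmath.exp(2j * cmath.pi * float(exponent))


if __name__ == "__main__":
    N = 3
    print(linking_phase(N, 1, 1, 1))   # a primitive cube root of unity
    print(linking_phase(N, N, 1, 1))   # charge N Wilson line: trivial phase
```

With N = 3, a charge 1 line linking once with a charge 1 surface gives a primitive cube
root of unity, while replacing either charge by N gives exactly 1.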


One might think that there are also ’t Hooft operators in the theory, constructed by
exponentiating the gauge invariant operators (3.85). The magnetic gauge field dual to
A is Â, and we can write


$$ T_A[C,\Sigma] = \exp\left(i\oint_C \hat{A} - iN\int_\Sigma B\right) $$    (3.86)


where, now, Σ is a surface which ends on the line C. The insertion of a ’t Hooft line is
equivalent to cutting out a tube S² × R around C and imposing ∫_{S²} F = 2π. However,
the operator T_A[C, Σ] is trivial in the theory. First, note that the attached surface
operator has charge N and so is invisible. Moreover, by a gauge transformation we can
always set Â = 0 locally. The real meaning of the ’t Hooft operator T_A[C, Σ] is simply
that N Wilson surface operators can end on a line.


We can view this in a slightly different way. Suppose that there are magnetic
monopoles of charge 1 under the U(1) gauge symmetry. This gauge symmetry is Higgsed
which means that these monopoles are attached to strings. But the minimum
string has charge 1/N, so the monopole is attached to N strings.


An analogous operator can be constructed using the magnetic dual to B. We have


$$ T_B[P,\tilde{C}] = \exp\left(i\phi(P) - iN\int_{\tilde{C}} A\right) $$


where now C̃ is a line which ends at the point P. The same arguments as above mean
that this operator is also trivial. It is telling us only that N Wilson line operators can
end at a point.


3.6.4 Gauging a Z_N One-Form Symmetry


Finally we can start to put the pieces together. Recall that G = SU(N) Yang-Mills
has a Z_N global electric one-form symmetry that acts on Wilson lines. We will show
that if we promote this one-form symmetry to a gauge symmetry then we end up with
G = SU(N)/Z_N Yang-Mills.


We can also play this game in reverse. Starting with G = SU(N)/Z_N Yang-Mills, we
can gauge the global magnetic one-form symmetry to return to G = SU(N) Yang-Mills.


To this end, let’s start with SU(N) Yang-Mills. We have a proliferation of gauge
fields of various kinds, and we’re running out of letters. So, for this section only, we
will refer to the SU(N) gauge connection as a. We will couple this to a BF theory
which we write in the form (3.84), with the Lagrange multiplier now called Z and the
dual gauge field called V,


$$ S_{BF} = \frac{i}{2\pi}\int Z\wedge(dV - NB) $$


The trick is to combine the SU(N) gauge connection a with the U(1) gauge connection
V to form a U(N) ≅ (U(1) × SU(N))/Z_N connection


$$ \mathcal{A} = a + \frac{1}{N}\, V\, \mathbb{1}_N $$

Here’s what’s going on. We could try to construct a flat connection a from an SU(N)/Z_N
bundle which is not an SU(N) bundle. This is not allowed in the SU(N) theory.
However, we can compensate this with a gauge connection V which would not be
allowed in a pure U(1) theory. The obstructions cancel between the two, so we’re left
with a good U(N) gauge connection. We then define the U(N) field strength


$$ \mathcal{G} = d\mathcal{A} + \mathcal{A}\wedge\mathcal{A} $$


This field strength is not invariant under the 1-form gauge symmetry of the BF theory,
namely V → V + Nλ and B → B + dλ; it transforms as


$$ \mathcal{G} \to \mathcal{G} + d\lambda $$


This means that we can’t simply write down the usual Yang-Mills term for 𝒢. Instead,
we need to form the gauge invariant combination 𝒢 − B and write the action


$$ S = \frac{1}{2g^2}\int \mathrm{Tr}\,(\mathcal{G} - B)\wedge{}^\star(\mathcal{G} - B) + \frac{i}{2\pi}\int Z\wedge(dV - NB) $$    (3.87)


Note that we have set the theta term to zero here because it comes with its own story
which we will tell later. To see what’s happening, we can look at the line operators.
We started with an SU(N) gauge theory with Wilson line


$$ W[C] = \mathrm{Tr}\, P\exp\left(i\oint_C a\right) $$    (3.88)


However, this is not invariant under the U(N) gauge transformations that lie in SU(N)/Z_N
rather than SU(N). So we need to augment it to get a gauge invariant operator. The
obvious thing to do is to replace a with the U(N) connection 𝒜, but now this fails to
be gauge invariant under the 1-form symmetry. To resolve this, we need to work with


$$ W[C,\Sigma] = W[C]\,\exp\left(\frac{i}{N}\oint_C V - i\int_\Sigma B\right) $$


where ∂Σ = C. This is now gauge invariant, but it comes with its own woes because
it’s not a line operator but a surface operator, depending on the choice of Σ. To get
an honest line operator, we need to take


$$ W^N[C,\Sigma] = W^N[C]\,\exp\left(i\oint_C V - iN\int_\Sigma B\right) $$


As before, the constraint from integrating out Z tells us that N ∫ B = ∫ dV. But on
any closed manifold, ∫ dV ∈ 2πZ. This means that the line operator W^N[C, Σ] doesn’t
really depend on the choice of Σ. But this is exactly the class of Wilson lines which
are allowed in SU(N)/Z_N.


From our discussion in the previous section (and in Section 2.6.2), we know that
the SU(N)/Z_N theory has more ’t Hooft lines than the SU(N) theory that we started
from. These are easy to write down in our new formulation: they are


$$ T[C] = \exp\left(i\int Z\right) $$    (3.89)


The Theta Term


Now let’s add a theta term into the game. One of the key distinctions between SU(N)
and SU(N)/Z_N Yang-Mills is that θ ∈ [0, 2π) in the former, while θ ∈ [0, 2πN) in the
latter. How does this distinction arise when transforming from one theory to another?


We start by writing the obvious, gauge invariant theta term


$$ S_\theta = \frac{i\theta}{8\pi^2}\int \mathrm{Tr}\,(\mathcal{G} - B)\wedge(\mathcal{G} - B) $$


where θ ∈ [0, 2π). Under the shift θ → θ + 2π, we apparently have


$$ \Delta S_\theta = \frac{i}{4\pi}\int \mathrm{Tr}\,\mathcal{G}\wedge\mathcal{G} - \frac{i}{2\pi}\int \mathrm{Tr}\,\mathcal{G}\wedge B + \frac{iN}{4\pi}\int B\wedge B $$


The equation of motion for Z tells us that Tr 𝒢 = dV = NB. Using this relation, we
have


$$ \Delta S_\theta = \frac{i}{4\pi}\int \mathrm{Tr}\,\mathcal{G}\wedge\mathcal{G} - \frac{iN}{4\pi}\int B\wedge B $$


The first term above is 2πi times an integer, so we have


$$ \Delta S_\theta = -\frac{iN}{4\pi}\int B\wedge B + 2\pi i\,\mathbb{Z} $$


We see that the action isn’t invariant under the shift θ → θ + 2π but, as we’ve seen
in other contexts, what we really care about is the exponential of the action. And this
too is not quite invariant, but shifts by a contact term for B. For this reason, we augment
our theta angle action to become


$$ S_\theta = \frac{i\theta}{8\pi^2}\int \mathrm{Tr}\,(\mathcal{G} - B)\wedge(\mathcal{G} - B) + \frac{ipN}{4\pi}\int B\wedge B $$    (3.90)


We will ultimately see that p plays the role of a discrete theta angle. First, we note
again that the effect of sending θ → θ + 2π is


$$ p \to p - 1 $$


At first glance, the B ∧ B term doesn’t look gauge invariant under shifts B → B + dλ.
But this is misleading: the term is gauge invariant provided that p ∈ Z. Indeed, our
original θ term is manifestly gauge invariant, so this contact term must also be. To see
this explicitly, note that under a gauge transformation, we have


$$ \frac{ipN}{4\pi}\int B\wedge B \to \frac{ipN}{4\pi}\int B\wedge B + \frac{ipN}{2\pi}\int d\lambda\wedge B + \frac{ipN}{4\pi}\int d\lambda\wedge d\lambda $$


Here the 1-form λ has ∫ dλ ∈ 2πZ which means that the last term is 2πi times an
integer. (Actually, for N even this is true, while for N odd it is true only on spin
manifolds.) Meanwhile, using the constraint NB = dV, we also have ∫ B ∈ (2π/N)Z,
so the second term is also 2πi times an integer and the partition function is gauge
invariant.


Finally, note that this same integrality constraint means that (1/4π) ∫ B ∧ B ∈ (2π/N²)Z.
This means that the discrete theta angle p in (3.90) can take values


$$ p = 0, 1, \ldots, N-1 $$


as we would expect. The theta angle of the SU(N)/Z_N theory will be


$$ \theta_{SU(N)/\mathbb{Z}_N} = 2\pi p + \theta \in [0, 2\pi N) $$    (3.91)


in agreement with our earlier discussion in Section 2.6.2.
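

The bookkeeping in (3.90) and (3.91) is easy to get wrong, so it can help to see it in
code. The following sketch is our own illustration, with hypothetical function names. It
stores θ/2π as an exact fraction, and checks that the combination 2πp + θ is left
unchanged both by the shift θ → θ + 2π accompanied by p → p − 1, and by p → p + N.

```python
from fractions import Fraction


def combined_theta(theta_over_2pi, p, N):
    """Return theta_{SU(N)/Z_N} / (2*pi) = p + theta/(2*pi), reduced into [0, N)."""
    return (Fraction(p) + Fraction(theta_over_2pi)) % N


def shift_theta(theta_over_2pi, p):
    """Apply theta -> theta + 2*pi, accompanied by p -> p - 1."""
    return Fraction(theta_over_2pi) + 1, p - 1


if __name__ == "__main__":
    N = 4
    theta, p = Fraction(1, 3), 2
    print(combined_theta(theta, p, N))                      # 7/3
    print(combined_theta(*shift_theta(theta, p), N))        # 7/3 again
    print([combined_theta(0, q, N) for q in range(N)])      # the N values at theta = 0
```

Working with exact fractions rather than floats means that the equalities hold exactly,
with no rounding to worry about.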


We would next like to see how the discrete theta angle p shifts the electric charge
of ’t Hooft lines. First there is a fairly straightforward, albeit slightly handwaving
argument. If we rewrite


$$ \frac{ipN}{4\pi}\int B\wedge B = \frac{ip}{4\pi N}\int dV\wedge dV $$


then we see that this looks like a standard theta term θ = 2πp/N for V. This will give
electric charge to ’t Hooft lines of V which are, equivalently, the Wilson lines of the
dual gauge field. These are precisely the operators (3.89) which we identified as the
new emergent ’t Hooft lines of the SU(N)/Z_N theory.


There is a more direct way to see this. We can also directly require that Z transforms
under the 1-form gauge symmetry as


$$ Z \to Z + p\,d\lambda $$    (3.92)


The integrality conditions ∫ Z ∈ 2πZ and ∫ dλ ∈ 2πZ are retained if p ∈ Z. This renders
the theory gauge invariant without imposing the constraint dV = NB. With the gauge
transformation on Z, we see immediately that the ’t Hooft lines (3.89) are no longer
gauge invariant, transforming as T[C] → e^{ip ∮_C λ} T[C]. To compensate, we’re forced to
use the line operators


$$ T[C]\;\mathrm{Tr}\, P\exp\left(-ip\oint_C \mathcal{A}\right) $$


This is the dyonic line operator, in which the magnetic ’t Hooft line picks up an electric
charge. This is precisely the expected effect of the discrete theta angle.


3.6.5 A ’t Hooft Anomaly in Time Reversal


It’s been rather a long road to put together all the machinery that we need. But, 
finally, we can put these ideas together to tell us something new. 


We sketched the main idea at the beginning of this section. We start with G = SU(N)
Yang-Mills which, as we now know, enjoys a Z_N global, electric one-form symmetry.
At two special values θ = 0 and θ = π it also enjoys time reversal invariance, as we
reviewed in Section 1.2.5.


Suppose that we work in the theory with θ = 0. If we gauge the Z_N one-form
symmetry, then we find ourselves left with the G = SU(N)/Z_N Yang-Mills theory, now
with θ_{SU(N)/Z_N} = 2πp with p the discrete theta angle that appeared in (3.91). We are
always free to pick p = 0 and we end up with a theory which preserves time reversal
invariance.


However, life is different if we sit at θ = π. Now if we gauge the Z_N one-form
symmetry, we’re left with the G = SU(N)/Z_N Yang-Mills, but now with


$$ \theta_{SU(N)/\mathbb{Z}_N} = (2p + 1)\pi $$


for some p ∈ Z. This theory is time reversal invariant only when θ_{SU(N)/Z_N} = 0 or
θ_{SU(N)/Z_N} = πN.


Let’s first consider N even. In this case, there is no choice of p ∈ Z for which our
final theory is time reversal invariant. We learn that if we start with θ = π then
we can gauge the Z_N one-form symmetry only at the cost of losing time reversal
invariance. In other words, we have a mixed ’t Hooft anomaly between the Z_N one-form
symmetry and time reversal.


So what are the consequences? Importantly, this anomaly must be reproduced in the
low-energy physics. At θ = 0, we expect Yang-Mills theory to be in a gapped, boring
phase, with nothing interesting going on beyond the strong coupling scale Λ_QCD. But
this cannot also be the case at θ = π: whatever physics occurs there has to account for
the anomaly. There are three options: the first two options are entirely analogous to
our discussion of ’t Hooft chiral anomalies in Section 3.5, but the third is novel:


• Time reversal invariance is spontaneously broken at θ = π. This means that the
theory is gapped, but with two degenerate ground states. There can be domain
walls between these two states.


Note that there is a theorem, due to Vafa and Witten, which says that parity
cannot be spontaneously broken in vector-like gauge theories, but this theorem
explicitly applies only at θ = 0.


• The theory is gapless at θ = π, and the resulting theory reproduces the discrete
’t Hooft anomaly.


• The theory is topological at θ = π. This means that it is gapped, with no low-
energy propagating degrees of freedom, but still has interesting things going on.
One way to probe the subtle behaviour of the theory is to place it on a non-trivial
background manifold. For example, the number of ground states depends on the
topology of the manifold.


What about when N is odd? Here it looks as if we are in better shape, because we
can always pick p = (N − 1)/2 to end up with θ_{SU(N)/Z_N} = Nπ. This means that,
strictly speaking, there is no ’t Hooft anomaly in this case. However, there is a global
inconsistency, because there is no choice of p which preserves time reversal for both
θ = 0 and θ = π. If we assume that the theory is confining, gapped and boring when
θ = 0 then there is always the possibility that the theory undergoes a first order phase
transition as we vary θ from 0 to π. However, if there is no such phase transition, then
the theory at θ = π must again be non-trivial, in the sense that it falls into one of the
three categories listed above. Thus, in the absence of a first order phase transition,
there is no difference between N even and N odd.
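

The counting in the even and odd cases can be checked by brute force. The sketch
below is our own, with a hypothetical function name. It measures angles in units of π,
takes θ_{SU(N)/Z_N} = 2πp + θ to be defined modulo 2πN, and assumes that time reversal
acts as θ → −θ, so that the theory is time reversal invariant precisely when twice the
theta angle vanishes modulo 2πN.

```python
def time_reversal_invariant_p(N, theta_is_pi):
    """Return the p in {0, ..., N-1} for which theta_{SU(N)/Z_N} is T-invariant.

    Angles are measured in units of pi, so theta_{SU(N)/Z_N}/pi equals 2p at
    theta = 0 and 2p + 1 at theta = pi, and is defined modulo 2N.  Time
    reversal flips the sign of the angle, so invariance requires that twice
    the angle vanishes modulo 2N.
    """
    offset = 1 if theta_is_pi else 0
    return [p for p in range(N) if (2 * (2 * p + offset)) % (2 * N) == 0]


if __name__ == "__main__":
    for N in (2, 3, 4, 5):
        print(N,
              time_reversal_invariant_p(N, theta_is_pi=False),
              time_reversal_invariant_p(N, theta_is_pi=True))
```

For N = 4 the list at θ = π is empty, which is the anomaly. For N = 3 the list at
θ = π is [1] while the list at θ = 0 is [0], which is the global inconsistency: no
single p works at both values of θ.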


So which of these possibilities occurs? We don’t know for sure, but we can take some
hints from large N. In Section 6.2.5, we will show that when N ≫ 1, the first option
above occurs, and time reversal is spontaneously broken at θ = π. There is a general
expectation that this behaviour persists for most, if not all, N, simply on the grounds
that it appears to be the simplest option.


There is, however, one tantalising possibility for G = SU(2) Yang-Mills. It has been
suggested that the theory at θ = π is actually gapless, and its dynamics is described by
a single U(1) gauge field. We currently have no way to determine whether this phase
is realised, or if time reversal is again spontaneously broken.


3.7 Further Reading 


The anomaly is one of the more subtle aspects of quantum field theory. Like much of
the subject, it has its roots in a combination of experimental particle physics, and a
healthy dose of utter confusion.


The story starts with an attempt to understand the decay of the neutral pion π⁰ into
two photons. (This story will be told in more detail in Section 5.4.3.) The neutral pion
is uncharged, so does not couple directly to photons. In 1949, Steinberger suggested
that the decay occurs through a loop process, with the SU(2) isospin triplet of pions
π^a coupling to the proton and neutron doublet N through the interaction


$$ G_{\pi N}\, \pi^a\, \bar{N}\gamma^5\sigma^a N $$    (3.93)


The resulting amplitude gets pretty close to the measured pion lifetime of about 10^{-16} s.
It appeared that all was good.


The trouble came some decades later with the realisation that the pion is a Goldstone 
boson. (We will explain this when we discuss chiral symmetry breaking in Section 5.) 
This means that couplings of the form (3.93) are not allowed: the pion can have only 
derivative couplings. Indeed, one can show that if all the symmetries of the classical 
Lagrangian hold, then a genuinely massless pion would be unable to decay into two 
photons [190, 198]. The previous success in predicting the decay of the pion suddenly 
appeared coincidental. 


The anomaly provides the resolution to this puzzle, as first pointed out in 1969 by Bell
and Jackiw [16] (yes, that Bell [15]) and, independently, by Adler [2]. The extension to
non-Abelian gauge groups was made by Bardeen in the same year [12]. (At this point
in time, his dad had only one Nobel prize.)


The gravitational contribution to the chiral anomaly was computed as early as 1972 
by Delbourgo and Salam [39]. The fact that anomalies cancel in the Standard Model 
was first shown in [82, 21], albeit phrased as avoiding a lack of renormalisability rather 
than avoiding a fatal inconsistency. (In fairness, non-renormalisability was thought to 
be fatal at the time.) 


The first hint that the anomaly was related to something deeper can be seen in
a proof, by Adler and Bardeen, that it is one-loop exact. But the full picture took some
years to emerge. The relation between instantons and the anomaly was first realised
by ’t Hooft [101], and the connection to the Atiyah-Singer index theorem was made in
[114].


The path integral approach that we described in these lectures is due to Fujikawa
and was developed ten years after the anomaly was first discovered [68, 69]. This was,
perhaps, the first time that properties of the path integral measure were shown to play
an important role in quantum field theory; this has been a major theme since, not least
with Witten’s discovery in 1982 of the SU(2) anomaly [225].


Excellent reviews of anomalies can be found in lectures by Bilal [18] and Harvey [90].


The idea of a ’t Hooft anomaly as an important constraint on low energy physics was 
introduced by ’t Hooft in the lectures [105]; its application to chiral symmetry breaking 
will be described in Section 5.6. 


Section 3.6 on anomalies in discrete symmetries contains somewhat newer material.
Discrete gauge symmetries have a long history on the lattice and, in the continuum,
were discussed in a number of papers studying geometry through the lens of QFT.
The presentation of BF theory given here was largely taken from [11] and generalised
higher form symmetries from [70]. The fact that these higher form symmetries can have
mixed anomalies with discrete symmetries, such as time reversal, was described in [71].
(The theorem which says that time reversal or parity cannot be spontaneously broken at
θ = 0 can be found in [196].) The quantum mechanics analogy of a particle on a circle
is taken from the appendix of [71].


4. Lattice Gauge Theory


Quantum field theory is hard. Part of the reason for our difficulties can be traced to 
the fact that quantum field theory has an infinite number of degrees of freedom. You 
may wonder whether things get simpler if we can replace quantum field theory with a 
different theory which has a finite, albeit very large, number of degrees of freedom. We 
will achieve this by discretizing space (and, as we will see, also time). The result goes 
by the name of lattice gauge theory. 


There is one, very practical reason for studying lattice gauge theory: with a discrete 
version of the theory at hand, we can put it on a computer and study it numerically. 
This has been a very successful programme, especially in studying the mass spectrum 
of Yang-Mills and QCD, but it is not our main concern here. Instead, we will use lattice 
gauge theory to build better intuition for some of the phenomena that we have met in 
these lectures, including confinement and some subtle issues regarding anomalies. 


There are different ways that we could envisage trying to write down a discrete 
theory: 


• Discretize space, but not time. We could, for example, replace space with a cubic,
three dimensional lattice. This is known as Hamiltonian lattice gauge theory.


This has the advantage that it preserves the structure of quantum mechanics, so 
we can discuss states in a Hilbert space and the way they evolve in (continuous) 
time. The resulting quantum lattice models are conceptually similar to the kinds 
of things we meet in condensed matter physics. The flip side is that we have 
butchered Lorentz invariance and must hope that it emerges at low energies. 


This is the approach that we will use when we first introduce fermions in Section 
4.3. But, for other fields, we will be even more discrete... 


• Discretize spacetime. We might hope to do this in such a way that preserves some
remnant of Lorentz invariance, and so provide a natural discrete approximation
to the path integral.


There are two ways we could go about doing this. First, we could try to construct
a lattice version of Minkowski space. This, it turns out, is a bad idea. Any lattice
clearly breaks the Lorentz group. However, while a regular lattice will preserve
some discrete remnant of the rotation group SO(3), it preserves no such remnant
of the Lorentz boosts. The difference arises because SO(3) is compact, while
SO(3,1) is non-compact. This means that if you act on a lattice with SO(3),
you will come back to your starting point after, say, a π rotation. In contrast,
acting with a Lorentz boost in SO(3,1) will take you further and further away
from your starting point. The upshot is that lattices in Minkowski space are not
a good idea.


The other option is to work with Euclidean spacetime. Here there is no problem 
in writing down a four dimensional lattice that preserves some discrete subgroup 
of SO(4). The flip side is that we have lost the essence of quantum mechanics; 
there is no Hilbert space, and no concept of entanglement. Instead, we have what 
is essentially a statistical mechanics system, with the Euclidean action playing 
the role of the free energy. Nonetheless, we can still compute correlation functions 
and, from this, extract the spectrum of the theory and we may hope that this is 
sufficient for our purposes. 


Throughout this section, we will work with a cubic, four-dimensional Euclidean
lattice, with lattice spacing a. We introduce four basis vectors, each of unit length. It
is useful, albeit initially slightly unfamiliar, to denote these as μ̂, with μ = 1, 2, 3, 4. A
point x in our discrete Euclidean spacetime is then restricted to lie on the lattice Γ,
defined by


$$ \Gamma = \left\{ x : x = \sum_{\mu=1}^{4} a\, n_\mu \hat{\mu}\,,\ n_\mu \in \mathbb{Z} \right\} $$    (4.1)


The lattice spacing plays the role of the ultra-violet cut-off in our theory


$$ a = \frac{1}{\Lambda_{UV}} $$


For the lattice to be a good approximation, we will need a to be much smaller than
any other physical length scale in our system.


Because our system no longer has continuous translational symmetry, we can’t invoke
Noether’s theorem to guarantee conservation of energy and momentum. Instead
we must resort to Bloch’s theorem which guarantees the conservation of “crystal
momentum”, lying in the Brillouin zone, |k| < π/a. (See, for example, the lectures on
Applications of Quantum Mechanics.) Umklapp processes are allowed in which the
lattice absorbs momentum, but only in units of 2π/a. This means that provided we focus
on low-momentum processes, k ≪ π/a, we effectively have conservation of momentum
and energy.
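

To make the role of the Brillouin zone concrete, here is a small Python sketch. It is
ours, not part of the lectures, and the function name is hypothetical. It folds a single
momentum component into the interval (−π/a, π/a]. Two momenta which differ by a
multiple of 2π/a are mapped to the same representative, which is all that crystal
momentum conservation keeps track of.

```python
import math


def fold_into_brillouin_zone(k, a):
    """Map a momentum component k to its representative in (-pi/a, pi/a].

    Momenta differing by a multiple of 2*pi/a are equivalent on a lattice
    with spacing a, which is the statement behind umklapp processes.
    """
    period = 2.0 * math.pi / a
    folded = math.fmod(k, period)      # lies strictly between -period and period
    if folded > period / 2:
        folded -= period
    elif folded <= -period / 2:
        folded += period
    return folded


if __name__ == "__main__":
    a = 1.0
    k_in = 0.6 * math.pi               # each of two incoming particles
    print(fold_into_brillouin_zone(2 * k_in, a) / math.pi)   # close to -0.8
```

In the usage example, two incoming particles each carry k = 0.6π/a. Their total
momentum 1.2π/a lies outside the Brillouin zone and is folded back to −0.8π/a, with
the lattice absorbing 2π/a. For k ≪ π/a the folding does nothing.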


(An aside: the discussion above was a little quick. Bloch’s theorem is really a
statement in quantum mechanics in which we have continuous time. It applies directly
only in the framework of Hamiltonian lattice gauge theory. In the present context, we
really mean that the implications of momentum conservation on correlation functions
will continue to hold in our discrete spacetime lattice, provided that we look at suitably
small momentum.)


4.1 Scalar Fields on the Lattice 


To ease our way into the discrete world, we start by considering a real scalar field φ(x).
A typical continuum action in Euclidean space takes the form


$$ S = \int d^4x \left[ \frac{1}{2}(\partial_\mu\phi)^2 + V(\phi) \right] $$    (4.2)


Our first task is to construct a discrete version of this, in which the degrees of freedom
are


$$ \phi(x) \quad\text{with}\quad x \in \Gamma $$


This is straightforward. The kinetic terms are replaced by the finite difference


$$ \partial_\mu\phi(x) \to \frac{\phi(x + a\hat{\mu}) - \phi(x)}{a} $$    (4.3)


while the integral over spacetime is replaced by the sum


$$ \int d^4x \to a^4 \sum_{x\in\Gamma} $$

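Our action (4.2) then becomes


$$ S = a^4 \sum_{x\in\Gamma} \left[ \frac{1}{2}\sum_\mu \left( \frac{\phi(x + a\hat{\mu}) - \phi(x)}{a} \right)^2 + V(\phi(x)) \right] $$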
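

As a concrete illustration, the lattice action can be evaluated in a few lines of Python.
This is a sketch under assumptions of our own: a small periodic lattice with L sites in
each direction (the periodic boundary conditions are our choice, made to keep the
lattice finite), the conventional factor of 1/2 in the kinetic term, and a field stored as
a dictionary from sites to values. The function and variable names are hypothetical.

```python
import itertools


def lattice_action(phi, L, a, potential):
    """Discretised Euclidean action for a real scalar on an L^4 periodic lattice.

    phi maps a site (n1, n2, n3, n4), with each n in range(L), to a float.
    Derivatives are forward differences and the spacetime integral is
    replaced by a^4 times a sum over sites.
    """
    total = 0.0
    for site in itertools.product(range(L), repeat=4):
        kinetic = 0.0
        for mu in range(4):
            neighbour = list(site)
            neighbour[mu] = (neighbour[mu] + 1) % L   # periodic boundary (our assumption)
            diff = (phi[tuple(neighbour)] - phi[site]) / a
            kinetic += 0.5 * diff * diff
        total += kinetic + potential(phi[site])
    return a**4 * total


if __name__ == "__main__":
    L, a = 2, 0.5
    constant = {site: 2.0 for site in itertools.product(range(L), repeat=4)}
    # Quadratic potential with unit bare mass, chosen only for illustration.
    print(lattice_action(constant, L, a, lambda f: 0.5 * f * f))
```

For a constant field the kinetic term vanishes, and the action reduces to a⁴ times the
number of sites times the potential evaluated on that constant.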


As always, this action sits in the path integral, whose measure is now simply a whole
bunch of ordinary integrals, one for each lattice point:

$$Z = \int \prod_{x \in \Gamma} d\phi(x) \; e^{-S}$$
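
To see that the discretised action really is just an ordinary function of finitely many variables, it can be evaluated directly. The following is a minimal sketch, assuming a small periodic hypercubic lattice and a field stored as a dictionary keyed by site; the periodic boundary conditions, the function name `lattice_action` and the test configurations are choices made for the example, not something fixed by the text.

```python
import itertools

def lattice_action(phi, shape, a, V):
    """Discretised Euclidean scalar action on a periodic hypercubic lattice.

    phi   : dict mapping site tuples (n_1, ..., n_d) to real field values
    shape : number of sites in each direction
    a     : lattice spacing
    V     : potential, a function of the field value at a single site
    """
    d = len(shape)
    total = 0.0
    for site in itertools.product(*(range(n) for n in shape)):
        kinetic = 0.0
        for mu in range(d):
            neighbour = list(site)
            neighbour[mu] = (neighbour[mu] + 1) % shape[mu]   # periodic boundary (assumed)
            diff = (phi[tuple(neighbour)] - phi[site]) / a     # forward finite difference
            kinetic += 0.5 * diff * diff
        total += kinetic + V(phi[site])
    return a**d * total

shape = (3, 3, 3, 3)
a = 0.5
sites = list(itertools.product(*(range(n) for n in shape)))

# A single-site bump with no potential: only the 8 links touching the origin contribute.
bump = {site: 0.0 for site in sites}
bump[(0, 0, 0, 0)] = 1.0
S_bump = lattice_action(bump, shape, a, lambda f: 0.0)

# A constant field with a mass term: only the potential contributes.
const = {site: 2.0 for site in sites}
S_const = lattice_action(const, shape, a, lambda f: 0.5 * f * f)

print(S_bump, S_const)
```

The bump costs $8 \times \frac{1}{2a^2} \times a^4 = 4a^2$ in action, while the constant field costs $a^4 \times 81 \times V$.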


With this machinery, computing correlation functions of any operators reduces to
performing a large but (at least for a lattice of finite size) finite number of integrals.

It’s useful to think about the renormalisation group (RG) in this framework. Suppose
that we start with a potential that takes the form

$$V(\phi) = \frac{1}{2} m_0^2 \phi^2 + \frac{\lambda_0}{4} \phi^4 \qquad (4.4)$$

As usual in quantum field theory, $m_0^2$ and $\lambda_0$ are the “bare” parameters, appropriate for
physics at the lattice scale. We can follow their fate under RG by performing the kind
of blocking transformation that was introduced in statistical mechanics by Kadanoff.
This is a real space RG procedure in which one integrates out the degrees of freedom
on alternate lattice sites, say all the sites in (4.1) in which one or more $n_\mu$ is odd.
This then leaves us with a new theory defined on a lattice with spacing $2a$. This will
renormalise the parameters in the action. In particular, the mass term will typically
shift to

$$m^2 \sim m_0^2 + \frac{\lambda_0}{a^2}$$

This is the naturalness issue for scalar fields. If we want to end up with a scalar field
with physical mass $m_{\rm phys}^2 \ll 1/a^2$, then the bare mass must be delicately tuned to be
of order the cut-off, $m_0^2 \sim -\lambda_0/a^2$, so that it cancels the contribution that arises when
performing RG. This makes it rather difficult in practice to put scalar fields on the
lattice. As we will see below, life is somewhat easier for gauge fields and, after jumping
through some hoops, for fermions.


As usual, RG does not leave the potential in the simple, comfortable form (4.4).
Instead it will generate all possible terms consistent with the symmetries of the
theory. These include higher terms such as $\phi^6$ and $\phi^8$ in the potential, as well as higher
derivative terms such as $(\partial_\mu \phi\, \partial^\mu \phi)^2$. (Here, and below, we use the derivative notation
as shorthand for the lattice finite difference (4.3).) This doesn’t bother us because all
of these terms are irrelevant (in the technical sense) and so don’t affect the low-energy
physics.


However, this raises a concern. The discrete rotational symmetry of the lattice is less
restrictive than the continuous rotational symmetry of $\mathbb{R}^4$. This means that RG on the
lattice will generate some terms involving derivatives $\partial\phi$ that would not arise in the
continuum theory. If these terms are irrelevant then they will not affect the infra-red
physics and we can sleep soundly, safe in the knowledge that the discrete theory will
indeed give a good approximation to the continuum theory at low energies. However, if
any of these new terms are relevant then we’re in trouble: now the low-energy physics
will not coincide with the continuum theory.

So what are the extra terms that arise from RG on the lattice? They must respect
the $\mathbb{Z}_2$ symmetry $\phi \to -\phi$ of the original action, which means that they have an even
number of $\phi$ fields. They must also respect the discrete rotation group that includes,
for example, $x_1 \to x_2$. This rules out lone terms like $(\partial_1 \phi)^2$. The lowest dimension
term involving derivatives that respects these symmetries is

$$\sum_{\mu=1}^{4} (\partial_\mu \phi)^2$$

But this is, of course, the usual derivative term in the action. The first operator that
is allowed on the lattice but prohibited in the continuum is

$$\sum_{\mu=1}^{4} \phi\, \partial_\mu^4 \phi \qquad (4.5)$$


This has dimension 6, and so is irrelevant. Happily, we learn that the lattice scalar field 
theory differs from the continuum only by irrelevant operators. Provided that we fine 
tune the mass, we expect the long wavelength physics to well approximate a continuum 
theory of a light scalar field. 
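
The dimension counting behind these statements is simple enough to automate. The following is a minimal sketch, assuming classical (engineering) dimensions near the free fixed point in $d = 4$, where $[\phi] = 1$ and each derivative adds one unit; the function names are illustrative.

```python
def operator_dimension(n_fields: int, n_derivatives: int, d: int = 4) -> float:
    """Engineering (mass) dimension of an operator with the given field and derivative content."""
    phi_dim = (d - 2) / 2            # dimension of phi, read off from the kinetic term
    return n_fields * phi_dim + n_derivatives

def classify(dim: float, d: int = 4) -> str:
    """Classify an operator by comparing its dimension to the spacetime dimension."""
    if dim < d:
        return "relevant"
    if dim == d:
        return "marginal"
    return "irrelevant"

operators = {
    "phi^2":             (2, 0),
    "phi^4":             (4, 0),
    "(d_mu phi)^2":      (2, 2),
    "phi^6":             (6, 0),
    "phi d_mu^4 phi":    (2, 4),   # the lattice-only operator (4.5)
    "(d phi . d phi)^2": (4, 4),
}

for name, (n_phi, n_der) in operators.items():
    dim = operator_dimension(n_phi, n_der)
    print(f"{name:20s} dimension {dim:.0f}: {classify(dim)}")
```

Quantum corrections shift these dimensions, so the classification is only a guide away from weak coupling.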


4.2 Gauge Fields on the Lattice 


We now come to Yang-Mills. Our task is to write down a discrete theory on the lattice
that reproduces the Yang-Mills action. For concreteness, we will restrict ourselves to
SU(N) gauge theory, with matter in the fundamental representation.


As a first guess, it’s tempting to follow the prescription for the scalar field described
above and introduce four Lie algebra valued gauge fields $A_\mu(x)$, with $\mu = 1,2,3,4$, at
each point $x \in \Gamma$. This, it turns out, is not the right way to proceed. At an operational
level, it is difficult to implement gauge invariance in such a formalism. But, more
importantly, this approach completely ignores the essence of the gauge field. It misses
the idea of holonomy.


4.2.1 The Wilson Action 


Mathematicians refer to the gauge field as a connection. This hints at the fact that
the gauge field is a guide, telling the internal, colour degrees of freedom of a particle
or field how to evolve through parallel transport. The gauge field “connects” these
internal degrees of freedom at one point in space to those in another.

We saw this idea earlier in Section 2 after introducing the Yang-Mills field. (See
Section 2.1.3.) Consider a test particle which carries an internal vector degree of freedom
$w_i$, with $i = 1, \ldots, N$. As the particle moves along a path $C$, from $x_i$ to $x_f$, this vector
will evolve through parallel transport

$$w(\tau_f) = U[x_i, x_f]\, w(\tau_i)$$

where the holonomy, or Wilson line, is given by the path ordered exponential

$$U[x_i, x_f] = \mathcal{P} \exp\left( i \int_{x_i}^{x_f} A \right) \qquad (4.6)$$

Note that $U[x_i, x_f]$ depends both on the end points, and on the choice of path $C$.


This is the key idea that we will implement on the lattice. We will not treat the
Lie-algebra valued gauge fields $A_\mu$ as the fundamental objects. Instead we will work with
the group-valued Wilson lines $U$. These Wilson lines are as much about the journey as
the destination: their role is to tell other fields how to evolve. The matter fields live
on the sites of the lattice. In contrast, the Wilson lines live on the links.


Specifically, on the link from lattice site $x$ to $x + \hat{\mu}$, we will introduce a dynamical
variable

$$\text{link } x \to x + \hat{\mu}: \quad U_\mu(x) \in G$$

The fact that the fundamental degrees of freedom are group valued, rather than Lie
algebra valued, plays an important role in lattice gauge theory. It means, for example,
that there is an immediate difference between, say, $SU(N)$ and $SU(N)/\mathbb{Z}_N$, a
distinction that was rather harder to see in the continuum. We will see other benefits of this
below.


At times we will wish to compare our lattice gauge theory with the more familiar
continuum action. To do this, we need to re-introduce the $A_\mu$ gauge fields. These are
related to the lattice degrees of freedom by

$$U_\mu(x) = e^{i a A_\mu(x)} \qquad (4.7)$$


The placing of the $\mu$ subscripts on the left and right hand side of this equation should
make you feel queasy. It looks bad because if one side transforms covariantly under
SO(4) rotations, then the other does not. But we don’t want these variables to
transform under continuous symmetries; only discrete ones. This is the source of your
discomfort.


We will wish to identify configurations related by gauge transformations. In the
continuum, under a gauge transformation $\Omega(x)$, the Wilson line (4.6) transforms as

$$U[x_i, x_f; C] \to \Omega(x_i)\, U[x_i, x_f; C]\, \Omega^\dagger(x_f)$$

We can directly translate this into our lattice. The link variable transforms as

$$U_\mu(x) \to \Omega(x)\, U_\mu(x)\, \Omega^\dagger(x + \hat{\mu}) \qquad (4.8)$$


The next step is to write down an action that is invariant under gauge transformations. 
We can achieve this by multiplying together a string of neighbouring Wilson lines, and 
then taking the trace. With no dangling ends, this is guaranteed to be gauge invariant. 
This is the lattice version of the Wilson loop (2.15) that we met in Section 2. 


We can construct a Wilson loop for any closed path $C$ in the lattice. When the path
goes from the site $x$ to $x + \hat{\mu}$, we include a factor of $U_\mu(x)$; when the path goes from
site $x$ to site $x - \hat{\mu}$, we include a factor of $U_\mu^\dagger(x - \hat{\mu})$. The simplest such path is a
square which traverses a single plaquette of the lattice as shown in Figure 34. The
corresponding Wilson loop is

$$W_\Box = \mathrm{tr}\; U_\mu(x)\, U_\nu(x + \hat{\mu})\, U_\mu^\dagger(x + \hat{\nu})\, U_\nu^\dagger(x)$$

Figure 34: A single plaquette, with corners at $x$, $x + \hat{\mu}$, $x + \hat{\nu}$ and $x + \hat{\mu} + \hat{\nu}$.
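
The gauge invariance of the plaquette is easy to check numerically. The following is a minimal sketch, assuming $G = SU(2)$ and representing each link as a 2×2 complex matrix in plain Python; the helper names (`random_su2`, `transform_link`, `plaquette`) and the site labels are illustrative, not taken from the text.

```python
import math
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(A):
    return A[0][0] + A[1][1]

def random_su2(rng):
    """Random SU(2) matrix a0 + i(a1 s1 + a2 s2 + a3 s3), with (a0, a1, a2, a3) a unit 4-vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in v))
    a0, a1, a2, a3 = (x / norm for x in v)
    return [[complex(a0, a3), complex(a2, a1)],
            [complex(-a2, a1), complex(a0, -a3)]]

def plaquette(U_mu_x, U_nu_xmu, U_mu_xnu, U_nu_x):
    """tr U_mu(x) U_nu(x+mu) U_mu(x+nu)^dagger U_nu(x)^dagger"""
    M = matmul(U_mu_x, U_nu_xmu)
    M = matmul(M, dagger(U_mu_xnu))
    M = matmul(M, dagger(U_nu_x))
    return trace(M)

def transform_link(omega_start, U, omega_end):
    """Gauge transformation (4.8) of a link running from 'start' to 'end'."""
    return matmul(matmul(omega_start, U), dagger(omega_end))

rng = random.Random(0)
links = [random_su2(rng) for _ in range(4)]   # U_mu(x), U_nu(x+mu), U_mu(x+nu), U_nu(x)
om = {s: random_su2(rng) for s in ("x", "x+mu", "x+nu", "x+mu+nu")}

W_before = plaquette(*links)
W_after = plaquette(
    transform_link(om["x"], links[0], om["x+mu"]),
    transform_link(om["x+mu"], links[1], om["x+mu+nu"]),
    transform_link(om["x+nu"], links[2], om["x+mu+nu"]),
    transform_link(om["x"], links[3], om["x+nu"]),
)
print(W_before, W_after)
```

Each $\Omega$ meets its own $\Omega^\dagger$ at a corner of the square, so the two traces agree up to rounding.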


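To get some intuition for this object, we can write it in terms of the gauge field (4.7). We
will assume that we can Taylor expand the gauge field so that, for example,
$A_\nu(x + \hat{\mu}) \approx A_\nu(x) + a\,\partial_\mu A_\nu(x) + \ldots$. Then we have

$$W_\Box \approx \mathrm{tr}\; e^{iaA_\mu(x)}\, e^{ia(A_\nu(x) + a\partial_\mu A_\nu(x))}\, e^{-ia(A_\mu(x) + a\partial_\nu A_\mu(x))}\, e^{-iaA_\nu(x)}$$

$$\phantom{W_\Box} \approx \mathrm{tr}\; e^{ia\left(A_\mu(x) + A_\nu(x) + a\partial_\mu A_\nu(x) + \frac{ia}{2}[A_\mu(x), A_\nu(x)]\right)}\; e^{-ia\left(A_\nu(x) + A_\mu(x) + a\partial_\nu A_\mu(x) - \frac{ia}{2}[A_\mu(x), A_\nu(x)]\right)}$$

where, to go to the second line, we’ve used the BCH formula $e^A e^B = e^{A + B + \frac{1}{2}[A,B] + \ldots}$.
On both lines we’ve thrown away terms of order $a^3$ in the exponent. Using BCH just
once more, we have

$$W_\Box \approx \mathrm{tr}\; e^{ia^2 F_{\mu\nu}(x) + \ldots} = \mathrm{tr}\left( 1 + ia^2 F_{\mu\nu} - \frac{a^4}{2} F_{\mu\nu}F_{\mu\nu} + \ldots \right) = -\frac{a^4}{2}\, \mathrm{tr}\, F_{\mu\nu}F_{\mu\nu} + \ldots$$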


where, as usual, $F_{\mu\nu}(x) = \partial_\mu A_\nu(x) - \partial_\nu A_\mu(x) - i[A_\mu(x), A_\nu(x)]$ and the $\ldots$ include both
a constant term and terms of higher order in $a^2$. Note that there is no sum over $\mu, \nu$
in this expression; instead these $\mu, \nu$ indices tell us the orientation of the plaquette.

By summing over all possible plaquettes, we get something that reproduces the
Yang-Mills action at leading order. The Wilson loop $W_\Box$ itself is not real so we need to add
the conjugate $W_\Box^\dagger$, which is the loop with the opposite orientation. This then gives us
the Wilson action

$$S_{\rm Wilson} = -\frac{\beta}{2N} \sum_\Box \left( W_\Box + W_\Box^\dagger \right) = \frac{\beta}{4N} \int d^4x \; \mathrm{tr}\, F_{\mu\nu}F^{\mu\nu} + \ldots \qquad (4.9)$$

where now we are again using summation convention for $\mu, \nu$ indices. An extra factor of
1/2 has appeared because the sum over plaquettes differs by a factor of 2 from the sum
over $\mu, \nu$. It is convention to put a factor of $1/N$ in front of the action. The coupling
$\beta$ is related to the continuum Yang-Mills coupling (2.8) by

$$\frac{\beta}{2N} = \frac{1}{g^2}$$
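
The small-$a$ expansion of the plaquette can be checked in the simplest possible setting. The following is a minimal sketch, assuming an Abelian $U(1)$ gauge field (so the trace and the commutators drop out) in the background $A_1 = 0$, $A_2 = B x_1$, which has constant field strength $F_{12} = B$; the function names and numerical values are illustrative.

```python
import cmath

def u1_link(direction, x, a, B):
    """U(1) link exp(i a A_direction(x)) for the background A_1 = 0, A_2 = B * x_1."""
    A = B * x[0] if direction == 2 else 0.0
    return cmath.exp(1j * a * A)

def u1_plaquette(x, a, B):
    """U_1(x) U_2(x + a e_1) U_1(x + a e_2)^* U_2(x)^* in the (1,2) plane."""
    x_plus_1 = (x[0] + a, x[1])
    x_plus_2 = (x[0], x[1] + a)
    return (u1_link(1, x, a, B)
            * u1_link(2, x_plus_1, a, B)
            * u1_link(1, x_plus_2, a, B).conjugate()
            * u1_link(2, x, a, B).conjugate())

a, B = 0.1, 2.0
W = u1_plaquette((0.3, 0.7), a, B)

# The plaquette is exp(i a^2 F_12); its real part starts at 1 - a^4 F^2 / 2.
print(W, cmath.exp(1j * a * a * B))
print(1.0 - W.real, a**4 * B**2 / 2)
```

The two printed pairs agree, the second up to corrections of order $a^8 F^4$, which is the Abelian shadow of the higher dimension operators discussed next.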
The Wilson action only coincides with the Yang-Mills action at leading order.
Expanding to higher orders in $a$ will give corrections. The next lowest dimension operator
to appear is $F_{\mu\nu} D_\mu^2 F_{\mu\nu}$. It has dimension 6 and does not correspond to an operator
that respects continuous O(4) rotational symmetry. In this way, it is analogous to the
operator (4.5) that we saw for the scalar field. Happily, it is irrelevant.


The Wilson action is far from unique. For example, we could have chosen to sum over
double plaquettes (rectangles built from two adjacent plaquettes) as opposed to single
plaquettes. Expanding these, or any such Wilson loop, will result in an $F_{\mu\nu}F^{\mu\nu}$ term
simply because this is the lowest dimension, gauge invariant operator. These Wilson
loops differ in the relative coefficients of the expansion.


For numerical purposes, this lack of uniqueness can be exploited. We could augment
the Wilson action with additional terms corresponding to double, or larger, plaquettes.
This can be done in such a way that the Yang-Mills action survives, but the higher
dimension operators, such as $F_{\mu\nu} D_\mu^2 F_{\mu\nu}$, cancel. This means that the leading higher
derivative terms are even more irrelevant, and helps with numerical convergence. We
won’t pursue this (or, indeed, any numerics) here.


Adding Dynamical Matter 


As we mentioned before, matter fields live on the sites of the lattice. Consider a
scalar field $\phi(x)$ transforming in the fundamental representation of the gauge group.
(Fermions will come with their own issues, which we discuss in Section 4.3.) Under a
gauge transformation we have

$$\phi(x) \to \Omega(x)\, \phi(x)$$

We can now construct gauge invariant objects by topping and tailing the Wilson line
with particle and anti-particle matter insertions. The simplest example has the particle
and anti-particle separated by just one lattice spacing, $\phi^\dagger(x)\, U_\mu(x)\, \phi(x + \hat{\mu})$. More
generally, we can separate the two as much as we like, as long as the Wilson line
forges a continuous path between them.


To write down a kinetic term for this scalar, we need the covariant version of the
finite difference (4.3). This is given by

$$\int d^4x \; |D_\mu \phi(x)|^2 \;\longrightarrow\; a^2 \sum_{(x,\mu)} \left[ 2\phi^\dagger(x)\phi(x) - \phi^\dagger(x)\, U_\mu(x)\, \phi(x + \hat{\mu}) - \phi^\dagger(x + \hat{\mu})\, U_\mu^\dagger(x)\, \phi(x) \right]$$

In this way, it is straightforward to couple scalar matter to gauge fields. We won’t have
anything more to say about dynamical matter here, but we’ll return to the question in
Section 4.3 when we discuss fermions on the lattice.


4.2.2 The Haar Measure 


To define a quantum field theory, it’s not enough to give the action. We also need to 
specify the measure of the path integral. 


Of course, usually in quantum field theory we’re fairly lax about this, and the measure 
certainly isn’t defined at the level of rigour that would satisfy a mathematician. The 
lattice provides us an opportunity to do better, since we have reduced the path integral 
to a large number of ordinary integrals. For lattice gauge theory, the appropriate 
measure is something like 


$$\prod_{(x,\mu)} dU_\mu(x) \qquad (4.10)$$

so that we integrate over the $U \in G$ degree of freedom on each link. The question is:
what does this mean?


Thankfully this is a question that is well understood. We want to define an inte- 
gration measure over the group manifold G. We will ask that the measure obeys the 
following requirements: 


• Left and right invariance. This means that for any function $f(U)$, with $U \in G$,
and for any $\Omega \in G$,

$$\int dU\, f(U) = \int dU\, f(\Omega U) = \int dU\, f(U\Omega) \qquad (4.11)$$

This will ensure that our path integral respects the gauge symmetry (4.8). By a
change of variables, this is equivalent to the requirement that $d(U\Omega) = d(\Omega U) = dU$
for all $\Omega \in G$.


• Linearity:

$$\int dU \left( \alpha f(U) + \beta g(U) \right) = \alpha \int dU\, f(U) + \beta \int dU\, g(U)$$

This is something that we take for granted in integration, and we would very
much like to retain it here.


• Normalisation condition:

$$\int dU\; 1 = 1 \qquad (4.12)$$

A difference between gauge theory on the lattice and in the continuum is that the
dynamical degrees of freedom live in the group $G$, rather than its Lie algebra. The
group manifold is compact, so that $\int dU\, 1$ just gives the volume of $G$. There’s
no real meaning to this volume, so we choose to normalise it to unity.


It turns out that there is a unique measure with these properties. It is known as the 
Haar measure. 


We won't need to explicitly construct the Haar measure in what follows, because 
the properties above are sufficient to calculate what we’ll need. Nonetheless, it may 
be useful to give a sense of where it comes from. We start in a neighbourhood of the 
identity. Here we can write any $SU(N)$ group element as

$$U = e^{i\alpha^a T^a}$$

with $T^a$ the generators of the $su(N)$ algebra. In this neighbourhood, the Haar measure
becomes (up to normalisation)

$$\int dU = \int d^{N^2-1}\alpha \; \sqrt{\det \gamma} \qquad (4.13)$$

where $\gamma$ is the canonical metric on the group manifold,

$$\gamma_{ab} = \mathrm{tr}\left( U^{-1} \frac{\partial U}{\partial \alpha^a}\, U^{-1} \frac{\partial U}{\partial \alpha^b} \right)$$

This measure is both left and right invariant in the sense of (4.11), since the group
action corresponds to shifting $\alpha^a \to \alpha^a + \text{constant}$.

Now suppose that we want to construct the measure in the neighbourhood of any
other point, say $U_0$. We can do this by using the group multiplication to transport the
neighbourhood around the identity to a corresponding neighbourhood around $U_0$. In
this way, we can construct the measure over various patches of the group manifold.


One way to transport the measure from one neighbourhood to another is by right
multiplication. We write

$$U = e^{i\alpha^a T^a}\, U_0 \qquad (4.14)$$

We then again use the definition (4.13) to define the measure. This measure is left
invariant, satisfying $dU = d(\Omega U)$ since multiplying $U$ on the left by $\Omega$ corresponds to
shifting $\alpha^a \to \alpha^a + \text{constant}$. In fact, this is the unique left invariant measure.


But is the measure right invariant? If we multiply $U$ on the right then the group
element $\Omega$ must make its way past $U_0$ before we can conclude that it shifts $\alpha^a$ by a
constant. But $\Omega$ and $U_0$ do not necessarily commute. Nonetheless, the measure is
right invariant. This follows from the fact that we have constructed the unique left
invariant measure which means that, if we consider the measure $d(U\Omega)$, which is also
left invariant then, by uniqueness, it must be the same as the original. So $d(U\Omega) = dU$.


Integrating over the Group 


In what follows, we will need results for some of the simpler integrations. 


We start by computing the integral $\int dU\, U$. Because the measure is both left and
right invariant, we must have

$$\int dU\; U = \int dU\; \Omega_1 U \Omega_2$$

for any $\Omega_1$ and $\Omega_2 \in G$. But there’s only one way to achieve this, which is

$$\int dU\; U = 0 \qquad (4.15)$$


More generally, we will only get a non-vanishing answer if we integrate objects which
are invariant under $G$. This will prove to be a powerful constraint, and we’ll discuss it
further below.

The simplest, non-trivial integral is therefore $\int dU\, U_{ij} U^\dagger_{kl}$, where we’ve included the
gauge group indices $i, j = 1, \ldots, N$. This must be proportional to an invariant tensor,
and the only option is

$$\int dU\; U_{ij}\, U^\dagger_{kl} = \frac{1}{N}\, \delta_{il}\delta_{jk} \qquad (4.16)$$

To see that the $1/N$ factor is correct, we can contract the $jk$ indices and reproduce
the normalisation condition (4.12). One further useful integral comes from the baryon
vertex, which gives

$$\int dU\; U_{i_1 j_1} \cdots U_{i_N j_N} = \frac{1}{N!}\, \epsilon_{i_1 \ldots i_N}\, \epsilon_{j_1 \ldots j_N}$$
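
The first two of these integrals can be checked by brute force. The following is a minimal Monte Carlo sketch, assuming $G = SU(2)$, for which Haar-distributed matrices can be drawn as uniformly random points on $S^3$; the sample size, seed and function names are choices made for the example, and the agreement is only statistical.

```python
import math
import random

def random_su2(rng):
    """Haar-random SU(2) matrix a0 + i(a1 s1 + a2 s2 + a3 s3), with (a0..a3) uniform on S^3."""
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in v))
    a0, a1, a2, a3 = (x / norm for x in v)
    return [[complex(a0, a3), complex(a2, a1)],
            [complex(-a2, a1), complex(a0, -a3)]]

def haar_moments(n_samples=20000, seed=1):
    """Monte Carlo estimates of int dU U_ij and int dU U_ij (U^dagger)_kl for SU(2)."""
    rng = random.Random(seed)
    first = [[0j, 0j], [0j, 0j]]
    second = {}
    for _ in range(n_samples):
        U = random_su2(rng)
        for i in range(2):
            for j in range(2):
                first[i][j] += U[i][j] / n_samples
                for k in range(2):
                    for l in range(2):
                        # (U^dagger)_{kl} is the complex conjugate of U_{lk}
                        term = U[i][j] * U[l][k].conjugate() / n_samples
                        second[(i, j, k, l)] = second.get((i, j, k, l), 0j) + term
    return first, second

first, second = haar_moments()
N = 2

# Largest deviation from (4.15): every entry of the first moment should vanish.
dev_first = max(abs(first[i][j]) for i in range(2) for j in range(2))

# Largest deviation from (4.16): delta_il delta_jk / N.
dev_second = max(
    abs(second[(i, j, k, l)] - (1.0 / N if (i == l and j == k) else 0.0))
    for i in range(2) for j in range(2) for k in range(2) for l in range(2)
)
print(dev_first, dev_second)
```

Both deviations shrink like the inverse square root of the number of samples, as expected for a Monte Carlo estimate.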


Elitzur’s Theorem 


Let’s now return to our lattice gauge theory. We wish to compute expectation values
of operators $\mathcal{O}$ by computing

$$\langle \mathcal{O} \rangle = \frac{1}{Z} \int \prod_{(x,\mu)} dU_\mu(x) \; \mathcal{O}\, e^{-S_{\rm Wilson}}$$

This is simply lots of copies of the group integration defined above. The fact that any
object which transforms under $G$ necessarily vanishes when integrated over the group
manifold has an important consequence for our gauge theory: it ensures that we have

$$\langle \mathcal{O} \rangle = 0$$


for any operator O that is not gauge invariant. This is known as Elitzur’s theorem. 
Note that this statement has nothing to do with confinement. It is just as valid for 
electromagnetism as for Yang-Mills, and is a statement about the operators we should 
be considering in a gauge theory. 


Elitzur’s theorem follows in a straightforward manner from (4.15). To illustrate the
basic idea, we will show how it works for a link variable, $\mathcal{O} = U_\nu(y)$. We want to
compute

$$\langle U_\nu(y) \rangle = \frac{1}{Z} \int \prod_{(x,\mu)} dU_\mu(x) \; U_\nu(y)\, e^{-S_{\rm Wilson}}$$

The specific link variable $U_\nu(y)$ will appear in a bunch of different plaquettes that arise
in the Wilson action. For example, we could focus on the plaquette Wilson loop

$$W_\Box = \mathrm{tr}\; U_\nu(y)\, U_\rho(y + \hat{\nu})\, U_\nu^\dagger(y + \hat{\rho})\, U_\rho^\dagger(y) \qquad (4.17)$$

But we know that the measure is invariant under group multiplication of any link
variable. We can therefore make the change of variable

$$U_\rho(y) \to U_\nu(y)\, U_\rho(y) \qquad (4.18)$$

in which case the particular plaquette Wilson loop (4.17) becomes

$$W_\Box \to \mathrm{tr}\; U_\rho(y + \hat{\nu})\, U_\nu^\dagger(y + \hat{\rho})\, U_\rho^\dagger(y)$$

and is independent of $U_\nu(y)$. You might think that this will screw up some other
plaquette action, where $U_\nu(y)$ will reappear. There are 8 links emanating from the
site $y$, as shown in the disappointingly 3d Figure 35. You can convince yourself
that if you make the same change of variables (4.18) for each of them then $S_{\rm Wilson}$
no longer depends on the specific link variable $U_\nu(y)$.

Figure 35: The links emanating from the site $y$, with the neighbouring sites $y + \hat{\nu}$ and $y + \hat{\rho}$ labelled.

We can then isolate the integral over the link variable $U_\nu(y)$, to get

$$\langle U_\nu(y) \rangle = \text{other stuff} \times \int dU_\nu(y)\; U_\nu(y) = 0$$


which, as shown, vanishes courtesy of (4.15). This tells us that a single link variable 
cannot play the role of an order parameter in lattice gauge theory. But this is something 
we expected from our discussion in the continuum. 


We see that the Wilson action is rather clever. It’s constructed from link variables
$U_\mu(x)$, but doesn’t actually depend on them individually. Instead, it depends only on
gauge invariant quantities that we can construct from the link variables. These are the
Wilson loops.


A Comment on Gauge Fixing 


The integration measure (4.10) will greatly overcount physical degrees of freedom: it 
will integrate over many configurations all of which are identified by gauge transfor- 
mations. What do we do about this? The rather wonderful answer is: nothing at 
all. 


In the continuum, we bend over backwards worrying about gauge fixing. This is
because we are integrating over the Lie algebra and will get a divergence unless we
fix a gauge. But there is no such divergence in the lattice formulation because we are
integrating over the compact group $G$. Instead, the result of failing to fix the gauge
will simply be a harmless normalisation constant.


4.2.3 The Strong Coupling Expansion


We now have all the machinery to define the partition function of lattice gauge theory,

$$Z = \int \prod_{(x,\mu)} dU_\mu(x) \; e^{-S_{\rm Wilson}} \qquad (4.19)$$

with $S_{\rm Wilson}$ the sum over plaquette Wilson loops,

$$S_{\rm Wilson} = -\frac{\beta}{2N} \sum_\Box \left( W_\Box + W_\Box^\dagger \right) \qquad (4.20)$$


Because we’re in Euclidean spacetime, the parameter $\beta$ plays the same role as the
inverse temperature in statistical mechanics. It is related to the bare Yang-Mills coupling
as $\beta = 2N/g^2$.


We expect this theory to give a good approximation to continuum Yang-Mills when
the lattice spacing $a$ is suitably small. Here “small” is relative to the dynamically
generated scale $\Lambda_{QCD}$. Thinking of $1/a$ as the UV cut-off of the theory, the physical
scale is defined by

$$\Lambda_{QCD} = \frac{1}{a}\, e^{1/(2\beta_0 g^2)} \qquad (4.21)$$

where $\beta_0$ is the one-loop beta-function which, despite the unfortunate similarity in their
names, has nothing to do with the lattice coupling $\beta$ that we introduced in the Wilson
action. We calculated the one-loop beta function in Section 2.4 and, importantly,
$\beta_0 < 0$.


In the expression (4.21), $g^2$ is the bare gauge coupling. We see that we have a
separation of scales between $\Lambda_{QCD}$ and the cut-off provided our theory is weakly coupled
in the UV,

$$g^2 \ll 1 \quad \Leftrightarrow \quad \beta \gg 1$$


In this case, we expect the lattice gauge theory to closely match the continuum. We 
only have to do some integrals. Lots of integrals. I can’t do them. You probably can’t 
either. But a computer can. 


We could also ask: what happens in the opposite regime, namely

$$g^2 \gg 1 \quad \Leftrightarrow \quad \beta \ll 1$$

It’s not obvious that this regime is of interest. From (4.21), we see that there is
no separation between the physical scale, $\Lambda_{QCD}$, and the cut-off scale $1/a$, so this is
unlikely to give us quantitative insight into continuum Yang-Mills. Nonetheless, it does
have one thing going for it: we can actually calculate in this regime! We do this by
expanding the partition function (4.19) in powers of $\beta$. This is usually referred to as
the strong coupling expansion; it is analogous to the high temperature expansion in
statistical lattice models. (See the lectures on Statistical Physics for more details of
how this works in the Ising model.)


Confinement 


We’ll use the strong coupling expansion to compute the expectation value of a large
rectangular Wilson loop, $W[C]$,

$$W[C] = \frac{1}{N}\, \mathrm{tr}\; \mathcal{P} \prod_{(x,\mu) \in C} U_\mu(x) \qquad (4.22)$$

Here the factor of $1/N$ is chosen so that if all the links are $U = 1$ then $W[C] = 1$.
We’ll place this loop in a plane of the lattice as shown in Figure 36, and give the
sides length $R$ and $T$. (Each of these must be an integer multiple of $a$.)

Figure 36: A rectangular Wilson loop lying in a plane of the lattice.

We would like to calculate

$$\langle W[C] \rangle = \frac{1}{Z} \int \prod_{(x,\mu)} dU_\mu(x) \; W[C]\, e^{-S_{\rm Wilson}}$$

In the strong coupling expansion, we achieve this by expanding $e^{-S_{\rm Wilson}}$ in powers of
$\beta \ll 1$. What is the first power of $\beta$ that will give a non-zero answer? If a given
link variable $U$ appears in the integrand just once then, as we’ve seen in (4.15), it will
integrate to zero. This means, for example, that the $\beta^0$ term in the expansion of $e^{-S_{\rm Wilson}}$
will not contribute, since it leaves each of the links in $W[C]$ unaccompanied.


The first term in the expansion of $e^{-S_{\rm Wilson}}$ that will give a non-vanishing answer
must contribute a $U^\dagger$ for each link in $C$. But any $U^\dagger$ that appears in the expansion
of $S_{\rm Wilson}$ must be part of a plaquette of links. The further links in these plaquettes
must also have companions, and these come from further plaquettes. It is best to
think graphically. The links $U$ of the Wilson loop are shown in red. They must be
compensated by a corresponding $U^\dagger$ from $S_{\rm Wilson}$ plaquettes; these are shown in blue in
Figure 37. The simplest way to make sure that no link is left behind is to tile a
surface bounded by $C$ by plaquettes. We have shown some of these tiles in the figure.

Figure 37: The Wilson loop (red) and some of the plaquettes from the action (blue) that tile a surface bounded by $C$.

Note that each of the plaquettes $W_\Box$ must have a particular orientation to cancel the
Wilson loop on the boundary; this orientation then dictates the way further tiles are
laid.

There are many different surfaces $S$ that we could use to tile the interior of $C$. The
simplest is the one that lies in the same plane as $C$ and covers each lattice plaquette
exactly once. However, there are other surfaces, including those that do not lie
in the plane. We can compute the contribution to $\langle W[C] \rangle$ from
any given surface $S$. Only the plaquettes of a specific orientation
in the Wilson action (4.20) will contribute (e.g. $W_\Box$, but not
$W_\Box^\dagger$). The beta dependence is therefore

$$\left( \frac{\beta}{2N} \right)^{\#\text{ of plaquettes}}$$

Each link in the surface (including those in the original $C$) will give rise to an integral
of the form (4.16). This then gives a term of the form

$$\left( \frac{1}{N} \right)^{\#\text{ of links}}$$

Finally, for every site on the surface (including those on the original $C$), we’ll be left
with a summation $\delta_{ij}\delta_{ji} = N$. This gives a factor of

$$N^{\#\text{ of sites}}$$

Including the overall factor of $1/N$ in the normalisation of the Wilson loop (4.22), we
have the contribution to the Wilson loop

$$\langle W[C] \rangle = \frac{1}{N} \left( \frac{\beta}{2N} \right)^{\#\text{ of plaquettes}} \left( \frac{1}{N} \right)^{\#\text{ of links}} N^{\#\text{ of sites}}$$


where we’ve used the fact that $Z = 1$ at leading order in $\beta$. This is the answer for a
general surface. The leading order contribution comes from the minimal, flat surface
that bounds $C$, which has

$$\#\text{ of plaquettes} = \frac{RT}{a^2}$$

and

$$\#\text{ vertical links} = \frac{(R + a)\,T}{a^2} \quad \text{and} \quad \#\text{ horizontal links} = \frac{R\,(T + a)}{a^2}$$

and

$$\#\text{ of sites} = \frac{(R + a)(T + a)}{a^2}$$

The upshot is that the leading order contribution to the Wilson loop is

$$\langle W[C] \rangle = \left( \frac{\beta}{2N^2} \right)^{RT/a^2}$$

But this is exactly what we expect from a confining theory: it is the long sought area
law (2.75) for the Wilson loop,

$$\langle W[C] \rangle = e^{-\sigma A}$$


where $A = RT$ is the area of the minimal surface bounded by $C$ and the string tension
$\sigma$ is given by

$$\sigma = -\frac{1}{a^2} \log\left( \frac{\beta}{2N^2} \right)$$

At the next order, this will get corrections of $O(\beta)$. Note that the string tension is
set by the UV cut-off $1/a$, which reminds us that we are not working in a physically
interesting regime. Nonetheless we have demonstrated, for the first time, the promised
area law of Yang-Mills, the diagnostic for confinement.
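
The bookkeeping of plaquettes, links and sites is easy to get wrong, so it is worth checking that the powers of $N$ really do collapse to an area law. The following is a minimal sketch, assuming the flat minimal surface and measuring $R$ and $T$ in lattice units; the function names and the sample values of $N$ and $\beta$ are illustrative.

```python
import math

def wilson_loop_leading_order(R: int, T: int, N: int, beta: float) -> float:
    """Leading strong-coupling contribution to <W[C]> for an R x T loop (lattice units)."""
    plaquettes = R * T
    links = (R + 1) * T + R * (T + 1)      # vertical plus horizontal links on the surface
    sites = (R + 1) * (T + 1)
    return (1.0 / N) * (beta / (2 * N)) ** plaquettes * (1.0 / N) ** links * float(N) ** sites

def string_tension(N: int, beta: float, a: float = 1.0) -> float:
    """sigma = -(1/a^2) log(beta / 2N^2), valid only for beta << 1."""
    return -math.log(beta / (2 * N**2)) / a**2

N, beta = 3, 0.5
value = wilson_loop_leading_order(3, 4, N, beta)
area_law = math.exp(-string_tension(N, beta) * 3 * 4)
print(value, area_law)
```

The factors of $N$ from links and sites combine into $N^{-RT}$, which is why only the area of the loop survives.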


A particularly jarring way to illustrate that we’re not computing in the continuum 
limit is to note that the computation above makes no use of the non-Abelian nature 
of the gauge group. We could repeat everything for Maxwell theory, in which the link 
variables are $U \in U(1)$. Nothing changes. We again find an area law in the strong
coupling regime, indicating the existence of a confining phase.


What are we to make of this? For U(1) gauge theory, there clearly must be a phase
transition as we vary the coupling from $\beta \ll 1$ to $\beta \gg 1$ where we have the free,
continuum Maxwell theory that we know and love. But what about Yang-Mills? We
may hope that there is no phase transition for non-Abelian gauge groups $G$, so that
the confining phase persists for all values of $\beta$. It seems that this hope is likely to be
dashed. At least as far as the string tension is concerned, it appears that there is a
finite radius of convergence around $\beta = 0$, and the string tension exhibits an essential
singularity at a finite value of $\beta$. It is not known if there is a different path, say by
choosing a different lattice action, which avoids this phase transition.


The Mass Gap 


We can also look for the existence of a mass gap in the strong coupling expansion.
Since we’re in Euclidean space, we have neither Hilbert space nor Hamiltonian so we
can’t talk directly about the spectrum. However, we can look at correlation functions
between two far separated objects.

The objects that we have to hand are the Wilson loops. We take two, parallel
plaquette Wilson loops $W_\Box$ and $W_\Box'$, separated along a lattice axis by distance $R$. We
expect the correlation function of these Wilson loops to scale as

$$\langle W_\Box\, W_\Box' \rangle \sim e^{-mR} \qquad (4.23)$$

with $m$ the mass of the lightest excitation. If the theory turns out to be gapless, we
will instead find power-law decay.

We can compute this correlation function in the strong coupling expansion. The
argument is the same as that above: to get a non-zero answer, we must form a tube of
plaquettes. The minimum such tube is depicted in Figure 38, with the source Wilson
loops shown in red, and the tiling from the action shown in blue. (This time we have
not shown the orientation of the Wilson loops to keep the figure uncluttered.) It has

$$\#\text{ of plaquettes} = \frac{4R}{a}, \qquad \#\text{ links} = \frac{4(2R + a)}{a}, \qquad \#\text{ sites} = \frac{4(R + a)}{a}$$

Figure 38: The minimal tube of plaquettes (blue) connecting the two source plaquette Wilson loops (red).


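The leading order contribution to the correlation function is therefore

$$\langle W_\Box\, W_\Box' \rangle = \left( \frac{\beta}{2N^2} \right)^{4R/a}$$

Comparing to the expected form (4.23), we see that we have a mass gap

$$m = -\frac{4}{a} \log\left( \frac{\beta}{2N^2} \right)$$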
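
The same counting check works for the tube. The following is a short sketch, assuming the minimal tube of square cross-section with $R$ measured in lattice units; the function names and sample values are illustrative.

```python
import math

def plaquette_correlator_leading_order(R: int, N: int, beta: float) -> float:
    """Leading strong-coupling contribution from the minimal tube of length R (lattice units)."""
    plaquettes = 4 * R
    links = 4 * R + 4 * (R + 1)     # links along the tube plus links around each cross-section
    sites = 4 * (R + 1)
    return (beta / (2 * N)) ** plaquettes * (1.0 / N) ** links * float(N) ** sites

def mass_gap(N: int, beta: float, a: float = 1.0) -> float:
    """m = -(4/a) log(beta / 2N^2), valid only for beta << 1."""
    return -4.0 * math.log(beta / (2 * N**2)) / a

N, beta, R = 3, 0.5, 2
correlator = plaquette_correlator_leading_order(R, N, beta)
print(correlator, math.exp(-mass_gap(N, beta) * R))
```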


Once again, it’s comforting to see the expected behaviour of Yang-Mills. Once again,
we see the lack of physical realism highlighted in the fact that the mass scale is the
same order of magnitude as the UV cut-off $1/a$.


4.3 Fermions on the Lattice


Finally we turn to fermions. Here things are not so straightforward. The reason is 
simple: anomalies. 


Even before we attempt any calculations, we can anticipate that things might be 
tricky. Lattice gauge theory is a regulated version of quantum field theory. If we work 
on a finite, but arbitrarily large lattice, we have a finite number of degrees of freedom. 
This means that we are back in the realm of quantum mechanics. There is no room for 
the subtleties associated to the chiral anomaly. There is no infinite availability at the 
Hilbert hotel. 


This means that we’re likely to run into trouble if we try to implement chiral sym- 
metry on the lattice or, at the very least, if we attempt to couple gapless fermions to 
gauge fields. We might expect even more trouble if we attempt to put chiral gauge 
theories on the lattice. In this section, we will see the form that this trouble takes. 


4.3.1 Fermions in Two Dimensions 


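We can build some intuition for the problems ahead by looking at fermions in $d = 1+1$
dimensions. Here, Dirac spinors are two-component objects. We work with the gamma
matrices

$$\gamma^0 = \sigma^1, \qquad \gamma^1 = i\sigma^2, \qquad \gamma^3 = -\gamma^0\gamma^1 = \sigma^3$$

The Dirac fermion then decomposes into chiral fermions $\psi_\pm$ as

$$\psi = \begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix}$$

In the continuum, the action for a massless fermion is

$$S = \int d^2x \; i\bar{\psi}\, \gamma^\mu \partial_\mu \psi = \int d^2x \; i\psi_+^\dagger \partial_- \psi_+ + i\psi_-^\dagger \partial_+ \psi_- \qquad (4.24)$$

with $\partial_\pm = \partial_t \pm \partial_x$. The equations of motion tell us $\partial_-\psi_+ = \partial_+\psi_- = 0$. This means
that $\psi_+$ is a left-moving fermion, while $\psi_-$ is a right-moving fermion.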
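
The chiral decomposition rests on a little matrix algebra which can be verified directly. The following is a minimal sketch, assuming the representation $\gamma^0 = \sigma^1$, $\gamma^1 = i\sigma^2$ and signature $(+,-)$; the helper functions are illustrative.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def close(A, B):
    return all(abs(A[i][j] - B[i][j]) < 1e-12 for i in range(2) for j in range(2))

identity = [[1, 0], [0, 1]]
sigma1 = [[0, 1], [1, 0]]
sigma2 = [[0, -1j], [1j, 0]]
sigma3 = [[1, 0], [0, -1]]

gamma0 = sigma1
gamma1 = scale(1j, sigma2)                    # i sigma^2 (assumed representation)
gamma3 = scale(-1, matmul(gamma0, gamma1))    # -gamma^0 gamma^1

def anticommutator(A, B):
    return add(matmul(A, B), matmul(B, A))

# Clifford algebra {gamma^mu, gamma^nu} = 2 eta^{mu nu} with eta = diag(+1, -1)
print(close(anticommutator(gamma0, gamma0), scale(2, identity)))
print(close(anticommutator(gamma1, gamma1), scale(-2, identity)))
print(close(anticommutator(gamma0, gamma1), scale(0, identity)))

# gamma^3 is diagonal, so its eigenvectors are the chiral components psi_+ and psi_-
print(close(gamma3, sigma3))
```

Because $\gamma^0\gamma^1 = -\sigma^3$ is diagonal, the kinetic operator $\partial_t + \gamma^0\gamma^1\partial_x$ acts as $\partial_-$ on the upper component and $\partial_+$ on the lower one, which is the split in (4.24).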


As in Section 3.1, it is useful to think in the language of the Dirac sea. The dispersion
relation $E(k)$ for fermions in the continuum is drawn in the left hand figure (Figure 39).
All states with $E < 0$ are to be thought of as filled; all states with $E > 0$ are empty.

Figure 39: The dispersion relation for a Dirac fermion in the continuum.

Figure 40: A possible deformation to keep the dispersion periodic in the Brillouin zone (with $a = 1$).

The (blue) line with positive gradient describes the excitations of the right-moving
fermion $\psi_-$: the particles have momentum $k > 0$ while the filled states have momentum
$k < 0$ which means that the anti-particles (a.k.a. holes) again have momentum $k > 0$.
Similarly, the (orange) line with negative gradient describes the excitations of the
left-moving fermion $\psi_+$.


The chiral symmetry of the action (4.24) means that the left- and right-handed 
fermions are individually conserved. As we have seen Section 3.1, this is no longer the 
case in the presence of gauge fields. But, for now, we will consider only free fermions so 
the chiral symmetry remains a good symmetry, albeit one that has a ’t Hooft anomaly. 


So much for the continuum. What happens if we introduce a lattice? We will start 
by keeping time continuous, but making space discrete with lattice spacing a. This is 
familiar from condensed matter physics, and we know what happens: the momentum 
takes values in the Brillouin zone

$$k \in \left[-\frac{\pi}{a}, +\frac{\pi}{a}\right)$$

Importantly, the Brillouin zone is periodic. The momentum $k = +\pi/a$ is identified
with the momentum $k = -\pi/a$.


What does this mean for the dispersion relation? We’ll look at some concrete models 
shortly, but first let’s entertain a few possibilities. We require that the dispersion 
relation E(k) remains a continuous, smooth function, but now with $k \in S^1$ rather than
$k \in \mathbb{R}$. This means that the dispersion relation must be deformed in some way.


Figure 41: The dispersion relation for a right-handed fermion in the continuum.

Figure 42: A possible deformation to keep the dispersion periodic in the Brillouin zone (with a = 1).


One obvious possibility is shown in Figure 40: we deform the shape
of the dispersion relation so that it is horizontal at the boundary of the Brillouin zone
$|k| = \pi/a$. We then identify the states at $k = \pm\pi/a$. Although this seems rather mild,
it’s done something drastic to the chiral symmetry. If we take, say, a right-moving 
excitation with k > 0 and accelerate it, it will eventually circle the Brillouin zone and 
come back as a left-moving excitation. This is shown graphically by the fact that the 
blue line connects to the orange line at the edge of the Brillouin zone. (This is similar 
to the phenomenon of Bloch oscillations observed in cold atom systems; see the lectures 
on Applications of Quantum Mechanics.) Said another way, to get such a dispersion 
relation we must include an interaction term between $\psi_+$ and $\psi_-$. This means that,
even without introducing gauge fields, there is no separate conservation of left and 
right-moving particles: we have destroyed the chiral symmetry. Note, however, that 
we have to excite particles to the maximum energy to see violation of chiral symmetry, 
so it presumably survives at low energies. 


Suppose that we insist that we wish to preserve chiral symmetry. In fact, suppose 
that we try to be bolder and put just a single right-moving fermion $\psi_-$ on a lattice.
We know that the dispersion relation E(k) crosses the E = 0 axis at k = 0, with
dE/dk > 0. But now there’s no other line that it can join. The only option is that
the dispersion relation also crosses the E = 0 axis at some other point $k \neq 0$, now with
dE/dk < 0. An example is shown in Figure 42. Now the lattice has
an even more dramatic effect: it generates another low energy excitation, this time a 
left-mover. We learn that we don’t have a theory of a chiral fermion at all: instead 
we have a theory of two Weyl fermions of opposite chirality. Moreover, once again 
a right-moving excitation can evolve continuously into a left-moving excitation. This 
phenomenon is known as fermion doubling. 


You might think that you can simply ignore the high momentum fermion. And, of 
course, in a free theory you essentially can. But as soon as we turn on interactions
(for example, by adding gauge fields) these new fermions can be pair produced just
as easily as the original fermions. This is how the lattice avoids the gauge anomaly: it 
creates new fermion species! 


More generally, it is clear that the Brillouin zone must house as many gapless left- 
moving fermions as right-moving fermions. This is for a simple reason: what goes up, 
must come down. This is a precursor to the Nielsen-Ninomiya theorem that we will 
discuss in Section 4.3.3.
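
The "what goes up must come down" statement is easy to test numerically. The sketch below is only an illustration: it builds some hypothetical smooth periodic bands (random trigonometric polynomials, which are an assumption of this example and not a model from these lectures) and counts the zero crossings of E(k) around the Brillouin zone with a = 1, separated by the sign of the slope.

```python
import math
import random

def crossings(E, samples=20000):
    """Count zero crossings of a periodic E(k) on [-pi, pi), split into (upward, downward)."""
    ks = [-math.pi + 2 * math.pi * n / samples for n in range(samples)]
    vals = [E(k) for k in ks]
    up = down = 0
    for n in range(samples):
        e0, e1 = vals[n], vals[(n + 1) % samples]  # wrap around: k = +pi is k = -pi
        if e0 < 0 <= e1:
            up += 1      # dE/dk > 0 at the crossing: a right-mover
        elif e0 >= 0 > e1:
            down += 1    # dE/dk < 0 at the crossing: a left-mover
    return up, down

def random_band(rng, harmonics=4):
    """A hypothetical smooth periodic band: a random real trigonometric polynomial."""
    coeffs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(harmonics)]
    offset = rng.uniform(-0.3, 0.3)

    def E(k):
        return offset + sum(a * math.sin((m + 1) * k) + b * math.cos((m + 1) * k)
                            for m, (a, b) in enumerate(coeffs))
    return E

rng = random.Random(0)
results = [crossings(random_band(rng)) for _ in range(20)]
for up, down in results:
    print(f"right-movers: {up}, left-movers: {down}")
```

Whatever band is drawn, the two counts agree, because a continuous function on a circle must return to its starting value.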


Quantising a Chiral Fermion 


Let’s now see how things play out if we proceed in the obvious fashion. The Hamiltonian 
for a chiral fermion on a line is 


$$H = \pm\int dx\; i\psi^\dagger\partial_x\psi$$


The form of the Hamiltonian is the same for both chiralities; only the ± sign out front
determines whether the particle is left- or right-moving. As we will see below, the 
requirement that the Hamiltonian is positive definite will ultimately translate this sign 
into a choice of vacuum state above which all excitations move in a particular direction. 


For concreteness, we’ll work with right-moving fermions $\psi_-$. We discretise this system
in the obvious way: we consider a one-dimensional lattice with sites at x = na, where
$n \in \mathbb{Z}$, and take the Hamiltonian to be


$$H = -a\sum_{x \in a\mathbb{Z}} i\psi_-^\dagger(x)\left[\frac{\psi_-(x+a) - \psi_-(x-a)}{2a}\right]$$


The Hamiltonian is Hermitian as required. We introduce the usual momentum expan- 
sion 


$$\psi_-(x) = \int_{-\pi/a}^{+\pi/a}\frac{dk}{2\pi}\; e^{ikx}\, c_k$$


Note that we have momentum modes for both k > 0 and k < 0, even though this is a 
purely right-moving fermion. Inserting the mode expansion into the Hamiltonian gives 


$$H = \frac{1}{2a}\int_{-\pi/a}^{+\pi/a}\frac{dk}{2\pi}\; 2\sin(ka)\, c_k^\dagger c_k$$

From this we can extract the one-particle dispersion relation by constructing the state
$|k\rangle = c_k^\dagger|0\rangle$, to find the energy $H|k\rangle = E(k)|k\rangle$, with


$$E(k) = \frac{1}{a}\sin(ka)$$


This gives a dispersion relation of the kind we anticipated above: it has zeros at both 
k = 0 and at the edge of the Brillouin zone $k = \pi/a$. As promised, we started with a
right-moving fermion but the lattice has birthed a left-moving partner. 
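
As a rough check of this dispersion relation, one can apply the lattice difference operator directly to plane waves. The sketch below assumes a periodic ring of `n_sites` sites (a finite-size choice made purely for this illustration) and uses the one-particle operator read off from the lattice Hamiltonian for $\psi_-$, namely $(h\phi)(x) = -i[\phi(x+a) - \phi(x-a)]/2a$. The function names are hypothetical.

```python
import cmath
import math

def apply_h(phi, a=1.0):
    """Apply (h phi)(x) = -i [phi(x+a) - phi(x-a)] / (2a) on a periodic ring of sites."""
    n = len(phi)
    return [-1j * (phi[(x + 1) % n] - phi[(x - 1) % n]) / (2 * a) for x in range(n)]

def plane_wave_energy(m, n_sites=16, a=1.0):
    """Return (k, E) for the plane wave with momentum k = 2 pi m / (n_sites a)."""
    k = 2 * math.pi * m / (n_sites * a)
    phi = [cmath.exp(1j * k * x * a) for x in range(n_sites)]
    h_phi = apply_h(phi, a)
    # A plane wave is an eigenvector, so the ratio at any one site gives the eigenvalue.
    return k, (h_phi[0] / phi[0]).real

n_sites = 16
spectrum = [plane_wave_energy(m, n_sites) for m in range(-n_sites // 2, n_sites // 2)]
for k, E in spectrum:
    print(f"k = {k:+.3f}   E = {E:+.3f}   sin(k) = {math.sin(k):+.3f}")
```

The eigenvalues reproduce sin(ka)/a, with one zero at k = 0 where the slope is positive and a second zero at the edge of the Brillouin zone where the slope is negative.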


Finally, a quick comment on the existence of states with k < 0. The true vacuum is
not $|0\rangle$, but rather $|\Omega\rangle$ which has all states with E < 0 filled. This is the Dirac sea or,
rather, the Fermi sea since the number of such states is finite. This vacuum obeys $c_k|\Omega\rangle = 0$ for
k > 0 and $c_k^\dagger|\Omega\rangle = 0$ for k < 0. In this way, $c_k^\dagger$ creates a right-moving particle when
k > 0, and $c_k$ creates a right-moving anti-particle with momentum |k| when k < 0.


4.3.2 Fermions in Four Dimensions 


A very similar story plays out in d = 3+1 dimensions. A Weyl fermion $\psi_\pm$ is a
2-component complex spinor and obeys the equation of motion


$$\partial_0\psi_\pm = \pm\sigma^i\partial_i\psi_\pm$$


The Hamiltonian for a single Weyl fermion takes the form 


$$H = \pm\int d^3x\; i\psi_\pm^\dagger\,\sigma^i\partial_i\,\psi_\pm$$


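Once again, we wish to write down a discrete version of this Hamiltonian on a cubic
spatial lattice Γ. For concreteness, we’ll work with $\psi_-$. We take the Hamiltonian to be


$$H = -a^3\sum_{\mathbf{x}\in\Gamma}\ \sum_{i=1,2,3} i\psi_-^\dagger(\mathbf{x})\,\sigma^i\left[\frac{\psi_-(\mathbf{x}+a\hat{\mathbf{i}}) - \psi_-(\mathbf{x}-a\hat{\mathbf{i}})}{2a}\right]$$


where i = 1,2,3 labels the spatial directions. In momentum space, the spinor is


$$\psi_-(\mathbf{x}) = \int_{\rm BZ}\frac{d^3k}{(2\pi)^3}\; e^{i\mathbf{k}\cdot\mathbf{x}}\, c_{\mathbf{k}}$$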


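where $c_{\mathbf{k}}$ is again a two-component spinor. Here the momentum is integrated over the
Brillouin zone


$$k_i \in \left[-\frac{\pi}{a}, +\frac{\pi}{a}\right) \ , \quad i = 1,2,3$$


The Hamiltonian now takes the form


$$H = \frac{1}{2a}\int_{\rm BZ}\frac{d^3k}{(2\pi)^3}\ \sum_{i=1,2,3} 2\sin(k_i a)\; c_{\mathbf{k}}^\dagger\,\sigma^i\, c_{\mathbf{k}} \qquad (4.25)$$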


If we focus on single particle excitations, the spectrum now has two bands, corresponding
to a particle and anti-particle, and is given by


$$E(\mathbf{k}) = \frac{1}{a}\sum_{i=1,2,3}\sin(k_i a)\,\sigma^i$$


Close to the origin, $k \ll 1/a$, the Hamiltonian looks like that of the continuum fermion,
with dispersion


$$E(\mathbf{k}) \approx \mathbf{k}\cdot\boldsymbol{\sigma} \qquad (4.26)$$


This is referred to as the Dirac cone; it is sketched in Figure 43. Note that the bands
cross precisely at E = 0 which, in a relativistic theory, plays the role of the Fermi energy. If
the dispersion relation were to cross anywhere else, we would have a Fermi surface.


Figure 43: [sketch of the Dirac cone, E(k) against k]


The fact that the Dirac cone corresponds to a right-handed
fermion $\psi_-$ shows up only in the overall + sign of the Hamiltonian.
A left-handed fermion would have a minus sign in front. In fact, our full lattice
Hamiltonian (4.25) has both right- and left-handed fermions since, like the d = 1 + 1 
example above, it exhibits fermion doubling. There are gapless modes at momentum 


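$$k_i = 0 \ \ {\rm or}\ \ \frac{\pi}{a} \ , \quad i = 1,2,3$$


This gives $2^3 = 8$ gapless fermions in total. If we expand the dispersion relation around,
say, $\mathbf{k}_1 = (\pi/a, 0, 0)$, it looks like


$$E(\mathbf{k}') \approx -\mathbf{k}'\cdot\boldsymbol{\sigma} \quad {\rm where}\quad \mathbf{k}' = \mathbf{k} - \mathbf{k}_1$$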


which is left-handed. Of the 8 gapless modes, you can check that 4 are right-handed 
and 4 are left-handed. We see that, once again, the lattice has generated new gapless 
modes. Anything to avoid that anomaly. 

4.3.3 The Nielsen-Ninomiya Theorem


We saw above that a naive attempt to quantise a d = 3 + 1 chiral fermion gives equal 
numbers of left and right-handed fermions in the Brillouin zone. The Nielsen-Ninomiya 
theorem is the statement that, given certain assumptions, this is always going to be 
the case. It is the higher dimensional version of “what goes up must come down”. 


The Nielsen-Ninomiya theorem applies to free fermions. We will work in terms of 
the one-particle dispersion relation, rather than the many-body Hamiltonian. To begin 
with, we consider a dispersion relation for a single Weyl fermion (we will generalise 
shortly). In momentum space, the most general Hamiltonian is given by 


$$H = v_i(\mathbf{k})\,\sigma^i + \epsilon(\mathbf{k})\,\mathbf{1}_2 \qquad (4.27)$$

where k takes values in the Brillouin zone.


In the language of condensed matter physics, this Hamiltonian has two bands, cor- 
responding to the fact that each term is a 2 x 2 matrix. The first question that we will 
ask is: when do the two bands touch? This occurs when each $v_i(\mathbf{k}) = 0$ for i = 1,2,3.
This is three conditions, and so we expect to generically find solutions at points, rather
than lines, in the Brillouin zone ${\rm BZ} \subset \mathbb{R}^3$. Let us suppose that there are D such points,
which we call $\mathbf{k}_\alpha$, with α = 1, ..., D.


Expanding about any such point, the dispersion relation becomes 


$$H \approx v_{ij}(\mathbf{k}_\alpha)\,(k - k_\alpha)^i\,\sigma^j \quad {\rm with}\quad v_{ij} = \frac{\partial v_j}{\partial k^i}$$


This now takes a similar form to (4.26), but with an anisotropic dispersion relation. 
The chirality of the fermion is dictated by 


$${\rm chirality} = {\rm sign}\,\det v_{ij}(\mathbf{k}_\alpha) \qquad (4.28)$$


The assumption that the band crossing occurs only at points means that $\det v_{ij}(\mathbf{k}_\alpha) \neq 0$.
The Nielsen-Ninomiya theorem is the statement that, for any dispersion (4.27) in a 
Brillouin zone, there are equal numbers of left- and right-handed fermions. 
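
To see the formula (4.28) at work, here is a small sketch that applies it to the naive lattice Hamiltonian (4.25), for which $v_i = \sin(k_i a)/a$. The Jacobian $v_{ij}$ is estimated by finite differences, which is a choice made for this illustration; the helper names are hypothetical, and the label +1 simply means "the same chirality as the node at the origin".

```python
import itertools
import math

def v(k, a=1.0):
    """v_i(k) for the naive lattice Hamiltonian (4.25): v_i = sin(k_i a) / a."""
    return [math.sin(ki * a) / a for ki in k]

def jacobian(f, k, h=1e-5):
    """Estimate v_ij = d v_j / d k^i by central differences."""
    rows = []
    for i in range(3):
        kp, km = list(k), list(k)
        kp[i] += h
        km[i] -= h
        fp, fm = f(kp), f(km)
        rows.append([(fp[j] - fm[j]) / (2 * h) for j in range(3)])
    return rows

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def chirality(k):
    """Equation (4.28): the sign of det v_ij at a band-touching point."""
    return 1 if det3(jacobian(v, k)) > 0 else -1

# The eight band-touching points, with a = 1: each k_i is either 0 or pi.
nodes = list(itertools.product([0.0, math.pi], repeat=3))
chiralities = {k: chirality(k) for k in nodes}
total = sum(chiralities.values())
for k, c in chiralities.items():
    print(k, c)
print("sum of chiralities:", total)
```

Each component sitting at π/a flips the sign of the determinant, so the eight nodes split four against four and the chiralities sum to zero.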


We offer two proofs of this statement. The first follows from some simple topological 
considerations. For $\mathbf{k} \neq \mathbf{k}_\alpha$, we can define a unit vector

$$\hat{\mathbf{v}}(\mathbf{k}) = \frac{\mathbf{v}(\mathbf{k})}{|\mathbf{v}(\mathbf{k})|}$$

The key idea is that this unit vector can wind around each of the degenerate points $\mathbf{k}_\alpha$.
To see this, surround each such point with a sphere $S^2_\alpha$. Evaluated on these spheres, $\hat{\mathbf{v}}$
provides a map


$$\hat{\mathbf{v}}: S^2_\alpha \to S^2$$


But we know that such maps are characterised by $\Pi_2(S^2) = \mathbb{Z}$. Generically, this winding
will take values ±1 only. In non-generic cases, where we have, say, winding +2, we can
perturb the $v_i$ slightly and the offending degenerate point will split into two points each
with winding +1. This is the situation we will deal with.


This winding $\{+1,-1\} \subset \Pi_2(S^2)$ is precisely the chirality (4.28). One quick argument
for this is that a spatial inversion will flip both the winding and the sign of the
determinant.


To finish the argument, we need to show that the total winding must vanish. This 
follows from the compactness of the Brillouin zone. Here are some words. We could
consider a sphere $S^2_{\rm bigger}$ which encompasses more and more degenerate points. The
winding around this sphere is equal to the sum of the windings of the $S^2_\alpha$ which sit
inside it. By the time we get to a sphere $S^2_{\rm biggest}$, which encompasses all the points, we
can use the compactness of the Brillouin zone to contract the sphere back onto itself
on the other side. The winding around this sphere must, therefore, vanish.


Here are some corresponding equations. The winding number $\nu_\alpha$ is given by


$$\nu_\alpha = \frac{1}{8\pi}\int_{S^2_\alpha} d^2S_i\ \epsilon^{ijk}\epsilon^{abc}\,\hat{v}^a\,\frac{\partial\hat{v}^b}{\partial k^j}\frac{\partial\hat{v}^c}{\partial k^k} = \pm 1$$

We saw this expression previously in (2.89) when discussing ’t Hooft-Polyakov monopoles.
Let us define BZ′ as the Brillouin zone with the balls inside $S^2_\alpha$ excised. This means
that the boundary of BZ′ is


$$\partial({\rm BZ}') = \sum_{\alpha=1}^D S^2_\alpha$$


Note that this is where we’ve used the compactness of the Brillouin zone: there is no 
contribution to the boundary from infinity. We can then use Stokes’ theorem to write 


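$$\sum_{\alpha=1}^D \nu_\alpha = \frac{1}{8\pi}\int_{{\rm BZ}'} d^3k\ \frac{\partial}{\partial k^i}\left(\epsilon^{ijk}\epsilon^{abc}\,\hat{v}^a\,\frac{\partial\hat{v}^b}{\partial k^j}\frac{\partial\hat{v}^c}{\partial k^k}\right)$$

But the bulk integrand is strictly zero,


$$\frac{\partial}{\partial k^i}\left(\epsilon^{ijk}\epsilon^{abc}\,\hat{v}^a\,\frac{\partial\hat{v}^b}{\partial k^j}\frac{\partial\hat{v}^c}{\partial k^k}\right) = \epsilon^{ijk}\epsilon^{abc}\,\frac{\partial\hat{v}^a}{\partial k^i}\frac{\partial\hat{v}^b}{\partial k^j}\frac{\partial\hat{v}^c}{\partial k^k} = 0$$


because each of the three vectors $\partial\hat{v}^a/\partial k^i$, i = 1,2,3, is orthogonal to $\hat{v}^a$ and so all three
must lie in the same plane. This tells us that

$$\sum_{\alpha=1}^D \nu_\alpha = 0$$

as promised.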


Note that the Nielsen-Ninomiya theorem counts only the points of degeneracy in the 
dispersion relation (4.27): it makes no comment about the energy $\epsilon(\mathbf{k}_\alpha)$ of these points.
To get relativistic physics in the continuum, we require that $\epsilon(\mathbf{k}_\alpha) = 0$. This ensures
that the bands cross precisely at the top of the Dirac sea, and there is no Fermi surface. 
This isn’t as finely tuned as it appears and arises naturally if there is one electron per 
unit cell; we saw an example of this phenomenon in the lectures on Applications of 
Quantum Field Theory when we discussed graphene. 


Another Proof of Nielsen-Ninomiya: Berry Phase 


There is another viewpoint on the Nielsen-Ninomiya theorem that is useful. This places 
the focus on the Hilbert space of states, rather than the dispersion relation itself.¹


For each $\mathbf{k} \in {\rm BZ}$, there are two states. As long as $\mathbf{k} \neq \mathbf{k}_\alpha$, these have different
energies. In the language of the Dirac sea, the one with lower energy is filled and the
one with higher energy is empty. We focus on the lower energy, filled states which we
refer to as $|\psi(\mathbf{k})\rangle$, $\mathbf{k} \neq \mathbf{k}_\alpha$. The Berry connection is a natural U(1) connection on these
filled states, which tells us how to relate their phases for different values of k,


$$\mathcal{A}_i(\mathbf{k}) = i\langle\psi|\frac{\partial}{\partial k^i}|\psi\rangle$$


You can find a detailed discussion of Berry phase in both the lectures on Applications 
of Quantum Field Theory and the lectures on Quantum Hall Effect. From the Berry 
phase, we can define the Berry curvature 


$$\mathcal{F}_{ij} = \frac{\partial\mathcal{A}_j}{\partial k^i} - \frac{\partial\mathcal{A}_i}{\partial k^j}$$


¹ This is closely related to the Nobel winning TKNN formula that we discussed in the lectures on the
Quantum Hall Effect.

The Berry curvature for the dispersion relation (4.27) is the simplest example that
we met when we first came across the Berry phase and is discussed in detail in both
previous lectures. The chirality of the gapless fermion can now be expressed in terms of
the curvature $\mathcal{F}$, which has the property that, when integrated around any degenerate
point $\mathbf{k}_\alpha$,


$$\nu_\alpha = \frac{1}{2\pi}\int_{S^2_\alpha}\mathcal{F} = \pm 1$$


Now we complete the argument in the same way as before. We have


$$\sum_{\alpha=1}^D\nu_\alpha = \frac{1}{2\pi}\int_{\partial({\rm BZ}')}\mathcal{F} = \frac{1}{2\pi}\int_{{\rm BZ}'} d\mathcal{F} = 0$$


Again, we learn that there are equal numbers of left- and right-handed fermions. 
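
The statement that the Berry flux through a small sphere is ±1 can be checked numerically. The sketch below is one way to do it, under assumptions of my own choosing: it takes the isotropic node $H = \pm\,\mathbf{n}\cdot\boldsymbol{\sigma}$ on a unit sphere, writes the filled state explicitly in polar angles, and sums the gauge-invariant phase of the overlap product around each plaquette of a (θ, φ) grid. The overall sign of the result depends on orientation and phase conventions, so the only claims are that the magnitude is 1 and that the two chiralities give opposite signs.

```python
import cmath
import math

def lower_state(theta, phi, sign=+1):
    """Filled (lower-energy) eigenstate of H = sign * n.sigma, with n at polar angles (theta, phi)."""
    s, c = math.sin(theta / 2), math.cos(theta / 2)
    if sign > 0:
        return (-s * cmath.exp(-1j * phi), c)   # eigenvalue -1 of n.sigma
    return (c, s * cmath.exp(1j * phi))          # eigenvalue +1 of n.sigma, i.e. -1 of -n.sigma

def overlap(u, w):
    """Inner product <u|w> of two-component states."""
    return u[0].conjugate() * w[0] + u[1].conjugate() * w[1]

def berry_flux(sign=+1, n_theta=40, n_phi=40):
    """Total Berry flux through the sphere, divided by 2 pi, summed plaquette by plaquette."""
    thetas = [math.pi * i / n_theta for i in range(n_theta + 1)]
    phis = [2 * math.pi * j / n_phi for j in range(n_phi)]
    u = [[lower_state(t, p, sign) for p in phis] for t in thetas]
    total = 0.0
    for i in range(n_theta):
        for j in range(n_phi):
            jn = (j + 1) % n_phi
            # Product of overlaps around one plaquette; its phase is gauge invariant.
            loop = (overlap(u[i][j], u[i + 1][j]) * overlap(u[i + 1][j], u[i + 1][jn])
                    * overlap(u[i + 1][jn], u[i][jn]) * overlap(u[i][jn], u[i][j]))
            total += cmath.phase(loop)
    return total / (2 * math.pi)

nu_plus = berry_flux(sign=+1)
nu_minus = berry_flux(sign=-1)
print(round(nu_plus), round(nu_minus))
```

Because the plaquette phases are gauge invariant, no care is needed about the phase ambiguity of the state at the poles of the sphere.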


We can extend this proof to systems with multiple bands. Suppose that we have
a system with q bands, of which p are filled. This state of affairs persists apart from
at points $\mathbf{k}_\alpha$ where the p-th band intersects the (p+1)-th. Away from these points, we
denote the filled states as $|\psi_a(\mathbf{k})\rangle$ with a = 1, ..., p. These states then define a U(p)
Berry connection

$$(\mathcal{A}_i)_{ab} = i\langle\psi_a|\frac{\partial}{\partial k^i}|\psi_b\rangle$$

and the associated U(p) field strength


$$(\mathcal{F}_{ij})_{ab} = \frac{\partial(\mathcal{A}_j)_{ab}}{\partial k^i} - \frac{\partial(\mathcal{A}_i)_{ab}}{\partial k^j} - i[\mathcal{A}_i, \mathcal{A}_j]_{ab}$$

This time the winding is


$$\nu_\alpha = \frac{1}{2\pi}\int_{S^2_\alpha}{\rm tr}\,\mathcal{F}$$


The same argument as above tells us that, again, $\sum_\alpha \nu_\alpha = 0$.


4.3.4 Approaches to Lattice QCD 


So far our discussion of fermions has been in the Hamiltonian formulation, where time 
remains continuous. The issues that we met above do not disappear when we consider 
discrete, Euclidean spacetime. For example, the action for a single massless Dirac 
fermion is 


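$$S = \int d^4x\; i\bar\psi\gamma^\mu\partial_\mu\psi$$

The obvious discrete generalisation is


$$S = a^4\sum_{x\in\Gamma}\sum_\mu i\bar\psi(x)\gamma^\mu\left[\frac{\psi(x+a\hat\mu) - \psi(x-a\hat\mu)}{2a}\right] \qquad (4.29)$$


Working in momentum space, this becomes


$$S = \frac{1}{a}\int_{\rm BZ}\frac{d^4k}{(2\pi)^4}\ \bar\psi_{-k}\,D(k)\,\psi_k \qquad (4.30)$$


with the inverse propagator


$$D(k) = \gamma^\mu\sin(k_\mu a) \qquad (4.31)$$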


We again see the fermion doubling problem, now in the guise of poles in the propagator 
$D^{-1}(k)$ at $k_\mu = 0$ and $k_\mu = \pi/a$. Since we have also discretised time, the problem has
become twice as bad: there are now $2^4 = 16$ poles.


The Nielsen-Ninomiya theorem that we met earlier has a direct translation in this 
context. It states that it is not possible to write down a D(k) in (4.30) that obeys the 
following four conditions, 


• D(k) is continuous within the Brillouin zone. This means, in particular, that it
is periodic in k.


• $D(k) \sim \gamma^\mu k_\mu$ when $k \ll 1/a$, so that the theory looks like a massless Dirac
fermion when the momentum is small.


• $D^{-1}(k)$ has poles only at k = 0. This is the requirement that there are no fermion
doublers. As we’ve seen, this requirement doesn’t hold if we follow the naive
discretisation (4.31).


• $\{\gamma^5, D(k)\} = 0$. This is the statement that the theory preserves chiral symmetry.
It is true for our naive approach (4.31), but this suffered from fermionic doublers.
As we will see below, if we try to remove these we necessarily screw with chiral
symmetry. Indeed, we saw a very similar story in Section 4.3.1 when we discussed
fermions in d = 1+1 dimensions.


What to make of this? Clearly, we’re not going to be able to simulate chiral gauge 
theories using these methods. But what about QCD? This is a non-chiral theory that 
involves only Dirac fermions. Even here, we have some difficulty because if we try to 
remove the doublers to get the right number of degrees of freedom, then we are going to 
break chiral symmetry explicitly. Of course, ultimately chiral symmetry will be broken 
by the anomaly anyway, but there’s interesting physics in that anomaly and that’s 
going to be hard to see if we’ve killed chiral symmetry from the outset. 




What to do? Here are some possible approaches. We will discuss a more innovative 
approach in the following section. 


SLAC Fermions 


We’re going to have to violate one of the requirements of the Nielsen-Ninomiya theorem. 
One possibility is to give up on periodicity in the Brillouin zone. Now what goes up 
need not necessarily come down. We make the dispersion relation discontinuous at 
some high momentum. For example, you could just set $D(k) = \gamma^\mu k_\mu$ everywhere, and
suffer the discontinuity at the edge of the Brillouin zone. This, it turns out, is bad. A 
discontinuity in momentum space corresponds to a breakdown of locality in real space. 
The resulting theories are not local quantum field theories. They do not behave in a 
nice manner. 


Wilson Fermions 


As we mentioned above, another possibility is to kill the doublers, at the expense of 
breaking chiral symmetry. One way to implement this, first suggested by Wilson, is to 
add to the original action (4.30) the term 


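$$ar\int d^4x\; \bar\psi\,\partial^2\psi = ar\, a^4\sum_{x\in\Gamma}\bar\psi(x)\sum_\mu\frac{\psi(x+a\hat\mu) - 2\psi(x) + \psi(x-a\hat\mu)}{a^2}$$


In momentum space, this contributes a term proportional to $r\sum_\mu\sin^2(k_\mu a/2)$,
and we’re left with the inverse propagator

$$D(k) = \gamma^\mu\sin(k_\mu a) + 4r\sum_\mu\sin^2\left(\frac{k_\mu a}{2}\right) \qquad (4.32)$$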
a 


This now satisfies the first three of the four requirements above, with all the spurious 
fermions at $k_\mu = \pi/a$ lifted. The resulting dispersion relation is analogous to what
we saw in d = 1+ 1 dimensions. The down side is that we have explicitly broken 
chiral symmetry, which can be seen by the lack of gamma matrices in the second term 
above. This becomes problematic when we consider interacting fermions, in particular 
when we introduce gauge fields. Under RG, we no longer enjoy the protection of chiral 
symmetry and expect to generate any terms which were previously prohibited, such 
as mass terms $\bar\psi\psi$ and dimension 5 operators $\bar\psi\gamma^\mu\gamma^\nu F_{\mu\nu}\psi$. Each of these must be fine
tuned away, just like the mass of the scalar in Section 4.1. 
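
The way the Wilson term lifts the doublers can be seen by evaluating it at the sixteen candidate points. The sketch below assumes that the extra contribution to D(k) is $4r\sum_\mu\sin^2(k_\mu a/2)$; the overall normalisation, the choices r = 1 and a = 1, and the function names are assumptions of this illustration.

```python
import itertools
import math

def naive_sines(k, a=1.0):
    """The coefficients sin(k_mu a) multiplying gamma^mu in the naive D(k) of (4.31)."""
    return [math.sin(km * a) for km in k]

def wilson_term(k, r=1.0, a=1.0):
    """Assumed Wilson contribution to D(k): 4 r sum_mu sin^2(k_mu a / 2)."""
    return 4 * r * sum(math.sin(km * a / 2) ** 2 for km in k)

# Candidate points: each k_mu is 0 or pi/a, with a = 1.
corners = list(itertools.product([0.0, math.pi], repeat=4))

# Zeros of the naive D(k): all four sines vanish.
naive_zeros = [k for k in corners if all(abs(s) < 1e-12 for s in naive_sines(k))]

# Zeros that survive once the Wilson term is added.
wilson_zeros = [k for k in naive_zeros if abs(wilson_term(k)) < 1e-12]

print("naive zeros:", len(naive_zeros))
print("zeros with Wilson term:", wilson_zeros)
for k in corners:
    print(k, round(wilson_term(k), 6))
```

With these conventions each component sitting at π/a costs 4r, so the fifteen doublers pick up a term of order r (in units of 1/a once the prefactor of the action is restored) while the mode at k = 0 is untouched.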

Staggered Fermions


The final approach is to embrace the fermion doublers. In fact, as we will see, we don’t 
need to embrace all 16 of them; only 4. 


To see this, we need to return to the real space formalism. At each lattice site, we 
have a 4-component Dirac spinor $\psi(x)$. We denote the position of the lattice site as
$x = a(n_1, n_2, n_3, n_4)$, with $n_\mu \in \mathbb{Z}$. We then introduce a new Dirac spinor $\chi(x)$, defined
by


$$\psi(x) = (\gamma^1)^{n_1}(\gamma^2)^{n_2}(\gamma^3)^{n_3}(\gamma^4)^{n_4}\,\chi(x) \qquad (4.33)$$


In the action (4.29), we have $\bar\psi(x)\gamma^\mu\psi(x+a\hat\mu)$. Written in the $\chi$ variable, the term
$\gamma^\mu\psi(x+a\hat\mu)$ will have two extra powers of $\gamma^\mu$ compared to $\psi(x)$; one from the explicit
$\gamma^\mu$ out front, and the other coming from the definition (4.33). Since we have $(\gamma^\mu)^2 = +1$
in Euclidean space, we will find

$$\gamma^\mu\psi(x+a\hat\mu) = (-1)^{\rm some\ integer}\,(\gamma^1)^{n_1}(\gamma^2)^{n_2}(\gamma^3)^{n_3}(\gamma^4)^{n_4}\,\chi(x+a\hat\mu)$$


where the integer is determined by commuting various gamma matrices past each other.
But this means that the integrand of the action has terms of the form


$$\bar\psi(x)\gamma^\mu\psi(x+a\hat\mu) = \eta_\mu(x)\,\bar\chi(x)\chi(x+a\hat\mu)$$


where there’s been some more commuting and annihilating of gamma matrices going
on, resulting in the signs


$$\eta_1 = 1 \ , \quad \eta_2 = (-1)^{n_1} \ , \quad \eta_3 = (-1)^{n_1+n_2} \ , \quad \eta_4 = (-1)^{n_1+n_2+n_3}$$
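
The signs can be confirmed by brute force. The sketch below picks one particular set of Hermitian Euclidean gamma matrices (built from Pauli matrices by tensor products, a choice made only for this check) and assumes that the conjugate field carries the inverse factor, $\bar\psi(x) = \bar\chi(x)\,G(n)^\dagger$ with $G(n) = (\gamma^1)^{n_1}(\gamma^2)^{n_2}(\gamma^3)^{n_3}(\gamma^4)^{n_4}$. It then verifies that $G(n)^\dagger\gamma^\mu G(n+\hat\mu)$ is a sign times the identity matrix.

```python
import itertools

I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

def kron(A, B):
    """Tensor product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(A):
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

IDENTITY = kron(I2, I2)
# One choice of Hermitian, mutually anticommuting matrices that each square to one.
GAMMA = [kron(S1, I2), kron(S2, I2), kron(S3, S1), kron(S3, S2)]

def G(n):
    """The matrix (gamma^1)^n1 (gamma^2)^n2 (gamma^3)^n3 (gamma^4)^n4 of (4.33)."""
    out = IDENTITY
    for mu in range(4):
        if n[mu] % 2:          # even powers drop out because (gamma^mu)^2 = 1
            out = matmul(out, GAMMA[mu])
    return out

def eta_from_matrices(n, mu):
    """Read off eta_mu from G(n)^dagger gamma^mu G(n + mu_hat) = eta_mu * identity."""
    shifted = list(n)
    shifted[mu] += 1
    M = matmul(matmul(dagger(G(n)), GAMMA[mu]), G(shifted))
    eta = complex(M[0][0]).real
    for i in range(4):
        for j in range(4):
            assert abs(M[i][j] - (eta if i == j else 0)) < 1e-12
    return round(eta)

def eta_formula(n, mu):
    """eta_mu = (-1)^(n_1 + ... + n_{mu-1}), with mu counted from 0 here."""
    return (-1) ** sum(n[:mu])

all_match = all(eta_from_matrices(n, mu) == eta_formula(n, mu)
                for n in itertools.product(range(3), repeat=4) for mu in range(4))
print("signs agree on all sites tested:", all_match)
```

The fact that the product is proportional to the identity is the statement that the transformation has diagonalised the action in spinor space.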


The upshot is that the transformation (4.33) has diagonalised the action in spinor space. 
One can check that this same transformation goes through unscathed if we couple the 
fermion to gauge fields. This means that, on the lattice, we have 


$$\det(i\not{\!\!D}) = {\det}^4(D)$$


for some operator D. The operator D still includes contributions from the 16 fermions 
dotted around the Brillouin zone, but only one spinor index contribution from each. 
We may then take the fourth root and consider det(D) by itself. Perhaps surprisingly,
one still finds a relativistic theory in the infra-red, with 4 of the 16 doublers providing 
the necessary spinor degrees of freedom. 

Roughly speaking, you can think of the staggered fermions as arising from placing
just a single degree of freedom on each lattice site. After doubling, we have 16 degrees 
of freedom living at the origin of momentum space and the corners of the Brillouin 
zone. The staggering trick is to recombine these 16 degrees of freedom back into 4 
Dirac spinors. The idea that some subset of the fermion doublers may play the role of 
spin sounds strange at first glance, but is realised in d = 2+ 1 dimensions in graphene. 


This staggered approach still leaves us with 16/4 = 4 Dirac fermions. At high energy, 
these are coupled in a way which is distinct from four flavours in QCD. Nonetheless, it 
is thought that, when coupled to gauge fields, the continuum limit coincides with QCD 
with four flavours which, in this context, are referred to as tastes. The lattice theory 
has a U(1) × U(1) chiral symmetry, less than the U(4) × U(4) chiral symmetry of the
(classical) continuum but still sufficient to prevent the generation of masses. This is a 
practical advantage of staggered fermions. 


In fact, there are further reasons to be nervous about staggered fermions. As we’ve 
seen, the continuum limit results in 4 Dirac fermions. Let’s call them $\psi_{\alpha i}$, where
α = 1,2,3,4 is the spinor index and i = 1,2,3,4 is the taste (flavour) index. However,
these spinor and taste indices appear on the same footing in the lattice: both come from
doubling. This suggests that they will sit on the same footing in the continuum limit.
But that’s rather odd. It means that, upon a Lorentz transformation Λ, the resulting
Dirac spinors will transform as


$$\psi_{\alpha i} \to S[\Lambda]_\alpha{}^\beta\, S[\Lambda]_i{}^j\, \psi_{\beta j}$$


with S[Λ] the spinor representation of the Lorentz transformation. (Since we’re in
Euclidean space, it is strictly speaking just the rotation group SO(4).) The first term
$S[\Lambda]_\alpha{}^\beta$ is the transformation property that we would expect of a spinor, but the second
term $S[\Lambda]_i{}^j$ is very odd, since these are flavour indices. In particular, it means that if
we rotate by 2π, we never see the famous minus sign acting on the staggered fermions.
Instead we get two minus signs, one acting on each of the two indices, and the resulting object
actually has integer spin!


What’s going on here is that the object $\psi_{\alpha i}$ is really a bi-spinor, in the sense that
both α and i are spinor indices. In representation theory language, a Dirac spinor
transforms as $(\frac{1}{2},0) \oplus (0,\frac{1}{2})$. The staggered fermions then transform in


$$\left[(\tfrac{1}{2},0)\oplus(0,\tfrac{1}{2})\right]\otimes\left[(\tfrac{1}{2},0)\oplus(0,\tfrac{1}{2})\right] = 2(0,0)\oplus 2(\tfrac{1}{2},\tfrac{1}{2})\oplus(1,0)\oplus(0,1)$$


Here (0,0) are scalars, $(\frac{1}{2},\frac{1}{2})$ is the vector representation, while (1,0) and (0,1) are the
self-dual and anti-self-dual representations of 2-forms. In fact, formally, the collection
of objects on the right can be written as a sum of forms of different degrees,


$$\Phi = \phi^{(0)} + \phi^{(1)}_\mu dx^\mu + \phi^{(2)}_{\mu\nu}\,dx^\mu\wedge dx^\nu + \phi^{(3)}_{\mu\nu\rho}\,dx^\mu\wedge dx^\nu\wedge dx^\rho + \phi^{(4)}_{\mu\nu\rho\sigma}\,dx^\mu\wedge dx^\nu\wedge dx^\rho\wedge dx^\sigma$$


where Poincaré duality means that the 4-form has the same degrees of freedom as a
scalar and the 3-form the same as a vector. The 16 degrees of freedom that sit in
the staggered fermions $\psi_{\alpha i}$ can then be rearranged to sit in Φ. Moreover, the Dirac
equation on ψ has a nice description in terms of these forms; it becomes


$$(d - \star d\star + m)\,\Phi = 0$$

This is sometimes called a Dirac-Kähler field.


The upshot is that staggered fermions don’t quite give rise to Dirac fermions, but a 
slightly more exotic object constructed in terms of forms. Nonetheless, this doesn’t stop 
people using them in an attempt to simulate QCD, largely because of the numerical 
advantage that they bring. Given the discussion above, one might be concerned that 
this is not quite a legal thing to do and it is, in fact, simulating a different theory. 


This is not the only difficulty with staggered fermions. The four tastes necessarily 
have the same mass meaning that, the problems above notwithstanding, staggered 
fermions do not allow us to get close to a realistic QCD theory, where the masses of 
the four lightest quarks are very different. To evade this issue, one sometimes attempts 
to simulate a single quark by taking yet another fourth-root, ${\det}^{1/4}(D)$. It seems clear
that this does not result in a local quantum field theory. Arguments have raged about 
how evil this procedure really is. 


4.4 Towards Chiral Fermions on the Lattice 


A wise man once said that, when deciding what to work on, you should first evaluate 
the importance of the problem and then divide by the number of people who are 
already working on it. By this criterion, the problem of putting chiral fermions on the 
lattice ranks highly. There is currently no fully satisfactory way of evading the Nielsen- 
Ninomiya theorem. This means that there is no way to put the Standard Model on a 
lattice. 


On a practical level, this is not a particularly pressing problem. It is the weak sector 
of the Standard Model which is chiral, and here perturbative methods work perfectly 
well. In contrast, the strong coupling sector of QCD is a vector-like theory and this is 
where most effort on the lattice has gone. However, on a philosophical level, the lack of
a lattice regularisation is rather disturbing. People will bang on endlessly about whether
or not we live in “the matrix”, seemingly unaware that there are serious obstacles to
writing down a discrete version of the known laws of physics, obstacles which, to date, 
no one has overcome. 


In this section, I will sketch some of the most promising ideas for how to put chiral 
fermions on a lattice. None of them quite works out in full yet, but they may well do in
the future. 


4.4.1 Domain Wall Fermions 


Our first approach has its roots in the continuum, which allows us to explain much of 
the basic idea without invoking the lattice. We start by working in d = 4+1 dimensions. 
The fifth dimension will be singled out in what follows, and we refer to it as $x^5 = y$.


In d = 4+1, the Dirac fermion has four components. The novelty is that we endow
the fermion with a spatially dependent mass, m(y),


$$i\gamma^\mu\partial_\mu\psi + i\gamma^5\partial_y\psi - m(y)\psi = 0 \qquad (4.34)$$

where we pick the boundary conditions

$$m(y) \to \pm M \quad {\rm as}\quad y \to \pm\infty$$

with M > 0. We will take the profile m(y) to be
monotonic, with m(y) = 0 only at y = 0. A typical
form of the mass profile is shown in Figure 44. Profiles
of this kind often arise when we solve equations
which interpolate between two degenerate vacua. In
that context, they are referred to as domain walls
and we’ll keep the same terminology, even though we
have chosen m(y) by hand.


Figure 44: [sketch of the mass profile m(y)]


The fermion excitation spectrum includes a continuum
of scattering states with energies E > M which can exist asymptotically in the y
direction. At these energies, physics is very much five dimensional. But there are also 
states with E < M which are bound to the wall. If we restrict to these energies then
physics is essentially four dimensional. In this sense, the mass M can be thought of as 
an unconventional cut-off for the four dimensional theory on the wall. 
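
In the chiral basis of gamma matrices,


$$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \ , \quad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix} \ \ i = 1,2,3 \ , \quad \gamma^5 = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$$


where the factors of i in $\gamma^5$ reflect the fact that we’re working in signature (+,−,−,−,−).
The Dirac equation becomes


$$i\partial_0\psi_- + i\sigma^i\partial_i\psi_- - \partial_5\psi_+ = m(y)\psi_+$$
$$i\partial_0\psi_+ - i\sigma^i\partial_i\psi_+ + \partial_5\psi_- = m(y)\psi_-$$

where $\psi = (\psi_+, \psi_-)^T$. There is one rather special solution to these equations,


$$\psi_+(x,y) = \exp\left(-\int_0^y dy'\, m(y')\right)\chi_+(x) \ , \quad \psi_-(x,y) = 0$$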


The profile is supported only in the vicinity of the domain wall; it dies off exponentially
as $\sim e^{-M|y|}$ when $y \to \pm\infty$. Importantly, there is no corresponding solution for $\psi_-$, since
the profile must be of the form $\exp\left(+\int dy'\, m(y')\right)$ which now diverges exponentially in
both directions.
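
A quick numerical sketch makes the asymmetry between the two chiralities concrete. It assumes the specific profile $m(y) = M\tanh(y/w)$, which is a hypothetical choice for illustration only (the argument above needs just $m \to \pm M$). For M = w = 1 the integral can be done in closed form, giving the profile $1/\cosh(y)$ for $\psi_+$, which the trapezoidal estimate below reproduces.

```python
import math

def mass(y, M=1.0, w=1.0):
    """Hypothetical domain wall profile m(y) = M tanh(y / w), with m -> +-M as y -> +-infinity."""
    return M * math.tanh(y / w)

def integrated_mass(y, steps=2000, **kw):
    """Trapezoidal estimate of the integral of m(y') from 0 to y (y may be negative)."""
    h = y / steps
    total = 0.5 * (mass(0.0, **kw) + mass(y, **kw))
    total += sum(mass(n * h, **kw) for n in range(1, steps))
    return total * h

def profile(y, chirality=+1, **kw):
    """exp(-/+ integral of m): chirality=+1 for psi_+, chirality=-1 for the would-be psi_- mode."""
    return math.exp(-chirality * integrated_mass(y, **kw))

for y in [-10.0, -3.0, 0.0, 3.0, 10.0]:
    print(f"y = {y:+5.1f}   psi_+ profile = {profile(y):.3e}   "
          f"psi_- profile = {profile(y, chirality=-1):.3e}")
```

The $\psi_+$ profile decays on both sides of the wall and is normalisable, while the candidate $\psi_-$ profile grows without bound in both directions.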


The two-component spinor $\chi_+(x)$ obeys the equation for a right-handed Weyl fermion,

$$\partial_0\chi_+ - \sigma^i\partial_i\chi_+ = 0$$

We see that we can naturally localise chiral fermions on domain walls. The existence
of this mode, known as a fermion zero mode, does not depend on any of the detailed
properties of m(y). We met a similar object in Section 3.3.4 when discussing the
topological insulator.


This is interesting. Our original 5d theory had no hint of any chiral symmetry. But, 
at low-energies, we find an emergent chiral fermion and an emergent chiral symmetry. 


Implications for the Lattice 


So far, our discussion in this section has taken place in the continuum. How does it 
help us in our quest to put chiral fermions on the lattice? 


The idea to apply domain wall fermions to lattice gauge theory is due to Kaplan. 
At first sight, this doesn’t seem to buy us very much: a straightforward discretisation 
of the Dirac equation (4.34) shows that the domain wall does nothing to get rid of the 
doublers: in Euclidean space there are now $2^4$ right-handed fermions $\psi_+$, with the new
modes sitting at the corners of the Brillouin zone as usual. Moreover, on the lattice
one also finds a further $2^4$ left-handed fermions $\psi_-$. This brings us right back to a
vector-like theory, with $2^4$ Dirac fermions.


However, the outlook is brighter when we add a 5d Wilson term (4.32) to the problem.
By tuning the coefficient to lie within a certain range, we can not only remove all of
the 16 left-handed fermions $\psi_-$, but we can remove 15 of the 16 right-handed fermions.
This leaves us with just a single right-handed Weyl fermion localised on the domain
wall.


It is surprising that the Wilson term (4.32) can remove an odd number of gapless 
fermions from the spectrum since everything we learned up until now suggests that 
gapless modes can only be removed in pairs. But we have something new here, which 
is the existence of the infinite fifth dimension. This gives a novel mechanism by which 
zero modes can disappear: they can become non-normalisable. 


There is an alternative way to view this. Suppose that we make the fifth direction 
compact. Then the domain wall must be accompanied by an anti-domain wall that 
sits at some distance L. While the domain wall houses a right-handed zero mode, the 
anti-domain wall has a left-handed zero mode. Now Nielsen-Ninomiya is obeyed, but 
the two fermions are sequestered on their respective walls, with any chiral symmetry 
breaking interaction suppressed by $e^{-L/a}$.

I will not present the analysis that leads to the conclusions above. But we will
address a number of questions that this raises. First, what happens if we couple the 
chiral mode on the domain wall to a gauge field? Second, how has the single chiral 
mode evaded the Nielsen-Ninomiya theorem? 


4.4.2 Anomaly Inflow 


We have seen that a domain wall in d = 4+1 dimensions naturally localises a chiral 
d = 3 + 1 fermion. This may make us nervous: what happens if we now couple the 
system to gauge fields? 


At low energies, the only degree of freedom is the zero mode on the domain wall, so 
we might think it makes sense to restrict our attention to this. (We’ll see shortly that 
things are actually a little more subtle.) Let us introduce a U(1) gauge field everywhere 
in d = 4+1 dimensional spacetime, under which the original Dirac fermion $\psi$ has charge 
+1. 


We haven’t yet discussed gauge theories in d = 4+1 dimensions, although we’ll 
learn a few things below. The first statement we’ll need is that there are no chiral 
anomalies in odd spacetime dimensions. This is because there is no analog of $\gamma^5$. We 
might, therefore, expect that a U(1) gauge theory coupled to a single Dirac fermion is 
consistent in d = 4+1 dimensions. We will revisit this expectation shortly. 

However, from a low energy perspective we seem to be in trouble, because there is 
a single massless chiral fermion $\chi_+$ on the domain wall which has charge +1 under 
the gauge field. The fact that the gauge field extends in one extra dimension does not 
stop the anomaly, which is now restricted to the region of the domain wall. Under the 
assumption that the zero mode is restricted to the y = 0 slice, the anomaly (3.34) for 
the gauge current is 

$$\partial_\mu j^\mu = \frac{1}{32\pi^2}\,\epsilon^{\mu\nu\rho\sigma} F_{\mu\nu} F_{\rho\sigma}\,\delta(y) \qquad (4.35)$$

It is a factor of 1/2 smaller than the chiral anomaly for a Dirac fermion because we 
have just a single Weyl fermion. This is bad: if the U(1) gauge field is dynamical then 
this is precisely the form of gauge anomaly that we cannot tolerate. Indeed, as we saw 
in (3.33), under a gauge transformation $A_\mu \to A_\mu + \partial_\mu\omega(x, y)$, the measure for the 4d 
chiral fermion will transform as 

$$\int \mathcal{D}\chi\,\mathcal{D}\bar\chi \ \to\ \int \mathcal{D}\chi\,\mathcal{D}\bar\chi\ \exp\left(-\frac{i}{32\pi^2}\int d^4x\ \omega(x, 0)\,\epsilon^{\mu\nu\rho\sigma} F_{\mu\nu} F_{\rho\sigma}\right) \qquad (4.36)$$


Fortunately, there is another phenomenon which will save us. Let’s return to d = 4+1 
dimensions. Far from the domain wall, the fermion is massive and we can happily 
integrate it out. You might think that as $m \to \infty$, the fermion simply decouples 
from the dynamics. But that doesn’t happen in odd spacetime dimensions. Instead, 
integrating out a massive fermion generates a term that is proportional to sign(m), 

$$S_{CS} = -\frac{k}{24\pi^2}\int d^5x\ \epsilon^{\mu\nu\rho\sigma\lambda} A_\mu\,\partial_\nu A_\rho\,\partial_\sigma A_\lambda \qquad (4.37)$$

with 

$$k = \frac{1}{2}\frac{m}{|m|}$$


This is a Chern-Simons term and k is referred to as the level. We will discuss the 
corresponding term in d = 2+ 1 dimensions in some detail in Section 8.4. We will also 
perform the analogous one-loop calculation in Section 8.5 and show how the Chern- 
Simons term, proportional to the sign of the mass, is generated when a Dirac fermion 
is integrated out. The calculation necessary to generate (4.37) is entirely analogous. 


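Under a gauge transformation $A_\mu \to A_\mu + \partial_\mu\omega$, the Chern-Simons action (4.37) 
transforms as 

$$\delta S_{CS} = -\frac{k}{24\pi^2}\int d^5x\ \partial_\mu\left(\epsilon^{\mu\nu\rho\sigma\lambda}\,\omega\,\partial_\nu A_\rho\,\partial_\sigma A_\lambda\right)$$

This is a total derivative. Under most circumstances, we can simply throw this away. 
But there are some circumstances when we cannot, and the presence of a domain wall 
is one such example. We take the thin wall limit, in which we approximate 

$$\frac{m}{|m|} = \begin{cases} -1 & y < 0 \\ +1 & y > 0 \end{cases}$$

Since the level is now spatially dependent, we should put it inside the integral. After 
some integration by parts, we find that the change of the Chern-Simons term is 

$$\delta S_{CS} = -\frac{1}{48\pi^2}\int d^5x\ \frac{m}{|m|}\,\partial_\mu\left(\epsilon^{\mu\nu\rho\sigma\lambda}\,\omega\,\partial_\nu A_\rho\,\partial_\sigma A_\lambda\right) = \frac{1}{48\pi^2}\int d^5x\ \partial_y\!\left(\frac{m}{|m|}\right)\epsilon^{y\nu\rho\sigma\lambda}\,\omega\,\partial_\nu A_\rho\,\partial_\sigma A_\lambda$$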

We see that this precisely cancels the gauge transformation that comes from the chiral 
fermion (4.36). A very similar situation occurs in the integer quantum Hall effect, 
where a 2d chiral fermion on the boundary compensates the lack of gauge invariance 
of a d = 2 + 1 dimensional Chern-Simons theory in the bulk. This was described in the 
lectures on the Quantum Hall Effect. 
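
The step that localises the gauge variation on the wall can be spelled out as a short worked equation. This is a sketch under the thin wall assumption, in which m/|m| is the sign function of y; the function f(y) is a hypothetical stand-in for the remaining integrand and is assumed to vanish as |y| grows large.

```latex
\frac{m(y)}{|m(y)|} = \mathrm{sign}(y)
\quad\Longrightarrow\quad
\partial_y\!\left(\frac{m}{|m|}\right) = 2\,\delta(y)

\int_{-\infty}^{\infty} dy\ \mathrm{sign}(y)\,\partial_y f(y)
  = -\int_{-\infty}^{0} dy\ \partial_y f + \int_{0}^{\infty} dy\ \partial_y f
  = -2 f(0)
```

A total derivative weighted by the sign of the mass therefore leaves behind a term supported only at y = 0, which is why the bulk Chern-Simons variation can talk to the fermion living on the wall.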


We learn that the total theory is gauge invariant, but only after we combine two 
subtle effects. In particular, the anomalous current (4.35) on the domain wall is real. 
A low energy observer, living on the wall, would see that the number of fermions is 
not conserved in the presence of an electric and magnetic field. But, for a higher 
dimensional observer there is no mystery. The current is generated in the bulk (strictly 
speaking, at infinity) by the Chern-Simons term, 

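$$j^\mu = \frac{\delta S_{CS}[A]}{\delta A_\mu} = -\frac{k}{32\pi^2}\,\epsilon^{\mu\nu\rho\sigma\lambda} F_{\nu\rho} F_{\sigma\lambda} = -\frac{1}{64\pi^2}\frac{m}{|m|}\,\epsilon^{\mu\nu\rho\sigma\lambda} F_{\nu\rho} F_{\sigma\lambda}$$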


The current is conserved in the bulk, but has a non-vanishing divergence on the domain 
wall where it is cancelled by the anomaly. This mechanism is referred to as anomaly 
inflow. 


There is one final subtlety. I mentioned above that the five-dimensional Maxwell 
theory coupled to a single Dirac fermion is consistent. This is not quite true. Even in 
the absence of a domain wall, one can show that the 5d Chern-Simons theory (4.37) 
is invariant under large gauge transformations only if we take $k \in \mathbb{Z}$. (We’ll explain 
why this is for 3d Chern-Simons theories in Section 8.4.) But integrating out a massive 
fermion gives rise to a half-integer k rather than integer. In other words, even in the 
absence of a domain wall the 5d theory is not quite gauge invariant. This doesn’t 
invalidate our discussion above; we simply need to add a bare Chern-Simons term 
with level k = 1/2 so that, after integrating out the massive fermion, the effective level 
is k = 1 when y > 0 and k = 0 when y < 0. (This discussion is slightly inaccurate: 
we’ll have more to say on these issues in Section 8.5.) 
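
The bookkeeping of levels in this paragraph can be sketched in a few lines. The sketch assumes only what is stated above: one massive Dirac fermion contributes k = m/(2|m|), and a bare level of 1/2 is added by hand. The function names are hypothetical.

```python
from fractions import Fraction

def induced_level(mass):
    """Level from integrating out one massive Dirac fermion: k = m / (2|m|)."""
    if mass == 0:
        raise ValueError("the fermion must be massive to be integrated out")
    return Fraction(1, 2) if mass > 0 else Fraction(-1, 2)

def effective_level(bare_level, mass):
    """Bare Chern-Simons level plus the fermion-induced shift."""
    return Fraction(bare_level) + induced_level(mass)

bare = Fraction(1, 2)
print(effective_level(bare, +1.0))  # 1  (the side of the wall with m > 0)
print(effective_level(bare, -1.0))  # 0  (the side of the wall with m < 0)
```

Both effective levels are integers, and they differ by one across the wall, matching the single chiral mode localised there.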


4.4.3 The Ginsparg-Wilson Relation 


We have not yet addressed exactly how the domain wall fermion evades the Nielsen- 
Ninomiya theorem. Here we explain the loophole. The idea that follows is more general 
than the domain wall, and goes by the name of overlap fermions. 


Rather than jump straight to the case of a Weyl fermion, let’s first go back and think 
about a Dirac fermion. We take the action in momentum space to be 

$$S = -\frac{1}{a}\int_{BZ}\frac{d^4k}{(2\pi)^4}\ \bar\psi_{-k}\,D(k)\,\psi_k$$

for some choice of inverse propagator D(k). As explained in Section 4.3.4, the 
Nielsen-Ninomiya theorem can be cast as four criteria which cannot all be simultaneously 
satisfied by D(k). One of these is the requirement that the theory has a chiral symmetry, 
in the guise of 

$$\{\gamma^5, D(k)\} = 0$$


The key idea is to relax this constraint, but relax it in a very particular way. We will 
instead require 

$$\{\gamma^5, D(k)\} = a\,D\,\gamma^5 D \qquad (4.38)$$

This is the Ginsparg-Wilson relation. Note the presence of the lattice spacing a on the 
right-hand-side. This means that in the continuum limit, which is naively $a \to 0$, we 
expect to restore chiral symmetry. 


In fact, the Ginsparg-Wilson relation ensures that a chiral symmetry exists at all 
scales. However, it’s rather different from the chiral symmetry that we’re used to. It’s 
simple to check that the action is invariant under 

$$\delta\psi = i\gamma^5\left(1 - \frac{a}{2}D\right)\psi \ ,\qquad \delta\bar\psi = i\bar\psi\left(1 - \frac{a}{2}D\right)\gamma^5 \qquad (4.39)$$

These transformation rules have the strange property that the amount a fermion is 
rotated depends on its momentum. In real space, this means that the symmetry does 
not act in the same way on all points of the lattice. In the language of condensed 
matter physics, it is not an onsite symmetry. This will cause us a headache shortly. 
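
The invariance of the action can be checked in one line. This is a sketch that assumes the transformation (4.39) reads $\delta\psi = i\gamma^5(1 - \tfrac{a}{2}D)\psi$ and $\delta\bar\psi = i\bar\psi(1 - \tfrac{a}{2}D)\gamma^5$, and then uses the Ginsparg-Wilson relation (4.38) in the last step.

```latex
\delta\left(\bar\psi D \psi\right)
  = \delta\bar\psi\, D\,\psi + \bar\psi\, D\,\delta\psi
  = i\,\bar\psi\left[\left(1 - \tfrac{a}{2}D\right)\gamma^5 D
      + D\,\gamma^5\left(1 - \tfrac{a}{2}D\right)\right]\psi
  = i\,\bar\psi\left[\{\gamma^5, D\} - a\,D\gamma^5 D\right]\psi
  = 0
```

The two terms of order a each contribute half of $a D\gamma^5 D$, which is exactly what is needed to cancel the non-vanishing anti-commutator.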

So the Ginsparg-Wilson relation (4.38) is sufficient to guarantee a chiral symmetry, 
albeit an unconventional one. The next, obvious question is: what form of D obeys 
this relation? It’s perhaps simplest to give a solution in the continuum, where a = 1/M 
is simply interpreted as some high mass scale. You can check that, in real (Euclidean) 
space, the following operator obeys the Ginsparg-Wilson relation, 

$$D = \frac{1}{a}\left(1 - \frac{1 - a\slashed{\partial}}{\sqrt{1 - a^2\partial^2}}\right) \qquad (4.40)$$

This is the overlap operator. It obeys the Hermiticity property $D^\dagger = \gamma^5 D \gamma^5$. At low 
momenta, $ka \ll 1$, we reproduce the usual Dirac operator, 

$$D = \slashed{\partial} + \ldots$$

At high momentum, things look stranger. In particular, the derivatives in the denominator 
mean that this operator is non-local. However, it’s not very non-local, and can 
be shown to fall off exponentially at large distances. 


The Ginsparg-Wilson relation relies only on the gamma matrix structure of the 
operator (4.40). This means that we can also write down operators on the lattice, 
simply by replacing $\slashed{\partial}$ by the operator appropriate for, say, Wilson fermions (4.32). 
Moreover, we can couple our fermions to gauge fields simply by replacing $\slashed{\partial}$ with $\slashed{D}$, or 
its lattice equivalent. 
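
The claim that the overlap operator obeys the Ginsparg-Wilson relation can be checked numerically in momentum space. The sketch below is an illustration under stated assumptions: it takes the continuum overlap operator to have the standard form D = (1/a)(1 - (1 - a dslash)/sqrt(1 - a^2 d^2)), replaces dslash by i kslash and d^2 by -k^2, and uses one particular Hermitian basis of Euclidean gamma matrices. All function names are hypothetical, and only the standard library is used.

```python
import math

# Pauli matrices and the 2x2 identity as nested lists.
S0 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker product of two square matrices."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, scale=1):
    """Return a + scale * b."""
    n = len(a)
    return [[a[i][j] + scale * b[i][j] for j in range(n)] for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

# One possible basis: Hermitian gammas with {gamma_mu, gamma_nu} = 2 delta_mu_nu.
GAMMA = [kron(S2, S1), kron(S2, S2), kron(S2, S3), kron(S1, S0)]
GAMMA5 = kron(S3, S0)
ONE = kron(S0, S0)

def overlap_operator(k, a):
    """Assumed form: D(k) = (1/a) * (1 - (1 - i a kslash) / sqrt(1 + a^2 k^2))."""
    kslash = [[0] * 4 for _ in range(4)]
    for k_mu, gamma in zip(k, GAMMA):
        kslash = add(kslash, gamma, k_mu)
    norm = math.sqrt(1 + a * a * sum(x * x for x in k))
    v = [[x / norm for x in row] for row in add(ONE, kslash, -1j * a)]
    return [[x / a for x in row] for row in add(ONE, v, -1)]

def gw_violation(k, a):
    """Largest entry of {gamma5, D} - a D gamma5 D."""
    d = overlap_operator(k, a)
    lhs = add(matmul(GAMMA5, d), matmul(d, GAMMA5))
    rhs = matmul(matmul(d, GAMMA5), d)
    return max_abs(add(lhs, rhs, -a))

def hermiticity_violation(k, a):
    """Largest entry of D^dagger - gamma5 D gamma5."""
    d = overlap_operator(k, a)
    return max_abs(add(dagger(d), matmul(matmul(GAMMA5, d), GAMMA5), -1))

k = (0.3, -0.7, 1.1, 0.2)
print(gw_violation(k, a=0.5) < 1e-12)           # True
print(hermiticity_violation(k, a=0.5) < 1e-12)  # True
```

The check works for any momentum because the combination (1 - i a kslash)/sqrt(1 + a^2 k^2) is unitary and is mapped to its inverse under conjugation by gamma5, which is all that the relation needs.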


Next, we can try to use this chiral symmetry to restrict the Dirac fermion to an 
analog of a Weyl fermion. Usually this is achieved by using the projection operators 

$$P_\pm = \frac{1}{2}\left(1 \pm \gamma^5\right)$$

For overlap fermions, we need a different projection operator. This is 

$$\hat P_\pm = \frac{1}{2}\left(1 \pm \gamma^5(1 - aD)\right)$$

You can check that these obey $\hat P_\pm^2 = \hat P_\pm$ and $\hat P_+ \hat P_- = 0$, using the Ginsparg-Wilson 
relation (4.38). To write down the theory in terms of chiral fermions, we actually need 
both projection operators: the action can be expressed as 

$$S = -\frac{1}{a}\int_{BZ}\frac{d^4k}{(2\pi)^4}\ \bar\psi_{-k}\,(P_+ + P_-)\,D(k)\,(\hat P_+ + \hat P_-)\,\psi_k$$

$$\phantom{S} = -\frac{1}{a}\int_{BZ}\frac{d^4k}{(2\pi)^4}\left[\bar\psi_{-k}\,P_+\,D(k)\,\hat P_-\,\psi_k + \bar\psi_{-k}\,P_-\,D(k)\,\hat P_+\,\psi_k\right]$$

Throwing away one of these terms leaves what can be thought of as a chiral fermion. It can be 
shown that if one writes down a strict 4d action for the domain wall fermion, it takes 
a form similar to that above. 
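
The projector properties follow from the Ginsparg-Wilson relation in a few steps. The sketch below introduces the shorthand $\hat\gamma^5 = \gamma^5(1 - aD)$, which is notation assumed here for convenience, so that $\hat P_\pm = \tfrac{1}{2}(1 \pm \hat\gamma^5)$.

```latex
% Multiply (4.38) on the left by \gamma^5:  D + \gamma^5 D \gamma^5 = a\,\gamma^5 D \gamma^5 D
(\hat\gamma^5)^2 = \gamma^5 (1 - aD)\,\gamma^5 (1 - aD)
  = 1 - a\left(D + \gamma^5 D \gamma^5\right) + a^2\,\gamma^5 D \gamma^5 D = 1

% Moving D through the modified chirality operator
D\,\hat\gamma^5 = D\gamma^5 - a\,D\gamma^5 D = D\gamma^5 - \{\gamma^5, D\} = -\gamma^5 D

% Hence
D\,\hat P_\pm = \tfrac{1}{2}\left(D \pm D\hat\gamma^5\right) = P_\mp\, D ,
\qquad
P_\pm\, D\, \hat P_\pm = P_\pm P_\mp\, D = 0
```

The first line shows that $\hat P_\pm$ are projectors. The last line shows that, of the four terms obtained by expanding the action, only those that pair $P_\pm$ on the left with $\hat P_\mp$ on the right are non-zero.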

It seems like we have orchestrated a way to put a chiral fermion on the lattice, albeit 
with a number of concessions forced upon us by the strange Ginsparg-Wilson relation. 
So what’s the catch? The problem comes because, although the action is invariant 
under (4.39), the measure is not. The measure for a Dirac fermion transforms as 

$$\delta\left[\mathcal{D}\bar\psi\,\mathcal{D}\psi\right] = \mathcal{D}\bar\psi\,\mathcal{D}\psi\ \mathrm{Tr}\left[i\gamma^5\left(1 - \frac{a}{2}D\right) + i\left(1 - \frac{a}{2}D\right)\gamma^5\right] = \mathcal{D}\bar\psi\,\mathcal{D}\psi\ \mathrm{Tr}\left[-ia\,\gamma^5 D\right]$$

This now smells like the way the anomaly shows up in the continuum. Except here, 
the lack of invariance shows up even before we couple to gauge fields. If we also include 
gauge fields, and project onto a chiral fermion, then we run into trouble. In general, 
the measure will not be gauge invariant. This, of course, is the usual story of anomalies. 
However, now life has become more complicated, in large part because of the non-onsite 
nature of the chiral transformation. What we would like to show is that the measure 
remains gauge invariant if and only if the matter coupling does not suffer a gauge 
anomaly. This was studied in some detail by Lüscher. The current state of the art is 
that this technique can be shown to be consistent for Abelian, chiral gauge theories, 
but open questions remain in the more interesting non-Abelian case. 


4.4.4 Other Approaches 


There is one final assumption of the Nielsen-Ninomiya theorem that we could try to 
leverage in an attempt to put chiral fermions on the lattice: this is the assumption 
that the fermions are free, so that we can talk in terms of a one-particle dispersion 
relation. One might wonder if it’s possible to turn on some interactions to lift collections 
of gapless fermions in a manner consistent with ’t Hooft anomalies, while preserving 
symmetries which you might naively have thought should be broken. There has been 
a large body of work on this topic, which now goes by the name of symmetric mass 
generation, starting with Eichten and Preskill. It’s interesting. 


4.5 Further Reading 


Kenneth Wilson is one of the more important figures in the development of quantum 
field theory. His work in the early 1970s on the renormalisation group, largely driven 
by the need to understand second order phase transitions in statistical physics, had 
an immediate impact on particle physics. The older ideas of renormalisation, due to 
Schwinger, Tomonaga, Feynman and Dyson, appeared to be little better than sweeping 
infinities under the carpet. Viewed through Wilson’s new lens, it was realised that these 
infinities are telling us something deep about the way Nature appears on different length 
scales. 

Wilson’s pioneering 1974 paper on lattice gauge theory showed how to discretise a 
gauge theory, and demonstrated the existence of confinement in the strong coupling 
regime [214]. He worked only with U(1) gauge group, although this was quickly gener- 
alised to a large number of non-Abelian gauge theories [8]. The Hamiltonian approach 
to lattice gauge theory was developed soon after by Kogut and Susskind [124]. 


In fact, Wilson was not the first to construct a lattice gauge theory. A few years 
earlier, Wegner described a lattice construction of what we now appreciate as $Z_2$ gauge 
theory [201]. The lattice continued to play a prominent role in many subsequent con- 
ceptual developments of quantum field theory, not least because such a (Hamiltonian) 
lattice really exists in condensed matter physics. Elitzur’s theorem was proven in [52]. 


Wilson’s original lattice gauge theory paper does not mention that a discrete version 
of the theory lends itself to numerical simulation, but this was surely on his mind. 
He later used numerical renormalisation group techniques [215] to solve the Kondo 
problem — a sea of electrons interacting with a spin impurity — which also exhibits 
asymptotic freedom [125]. It wasn’t until the late 1970s that people thought seriously 
about simulating Yang-Mills on the lattice. The first Monte Carlo simulation of four 
dimensional Yang-Mills was performed by Creutz in 1980 [33]. 


More details on the basics of lattice gauge theory can be found in the book by Creutz 
[34] or the review by Guy Moore [138]. 


Fermions on the Lattice 


Wilson introduced his approach to fermions, giving mass to the doublers at the corners 
of the Brillouin zone, in Erice lectures in 1975. To my knowledge, this has never been 
published. Other approaches soon followed: the discontinuous SLAC derivative in [48], 
and the staggered approach in [124]. The general problem of putting fermions on the 
lattice was later elaborated upon by Susskind [188]. The “rooting” trick, to reduce 
the number of staggered fermions, is prominently used in lattice simulations, but its 
validity remains controversial: see [180, 35] for arguments. 


The idea that placing fermions on the lattice is a deep, rather than irritating, problem 
is brought into sharp focus by the theorem of Nielsen and Ninomiya [146, 147]. 


The story of domain wall fermions has its origins firmly in the continuum. Jackiw 
and Rebbi were the first to realise that domain walls house chiral fermions [111], a result 
which now underlies the classification of certain topological insulators. The interaction 
of these fermions with gauge fields was studied by Callan and Harvey who introduced 
the idea of anomaly inflow [25]. The fact that this continuum story can be realised in 
the lattice setting was emphasised by David Kaplan [117]. 


In a parallel development, the Ginsparg-Wilson relation was introduced in [73] in a 
paper that sat unnoticed for many years. Maybe it would have helped if the authors 
were more famous. The first (and, to date, only) solution to this relation was discov- 
ered by Neuberger [143, 144], and the resulting exact chiral symmetry on the lattice 
was shown by Lüscher [126]. The relationship between domain wall fermions and the 
Ginsparg-Wilson relation was shown in [121]. 


The idea that strong coupling effects could lift the fermion doublers, in a way consistent 
with (’t Hooft) anomalies, was first suggested by Eichten and Preskill [51]. This 
subject has had a renaissance of late, starting with the pioneering work of Fidkowski 
and Kitaev on interacting 1d topological insulators [58, 59]. They show that Majorana 
zero modes can be lifted, preserving a particular time reversal symmetry, only in groups 
of 8. There are more conjectural extensions to higher dimensions where, again, it is 
thought that only specific numbers of fermions can be gapped together. In d = 3 + 1, 
the conjecture is that Weyl fermions can become gapped in groups of 16; the fact that 
the Standard Model (with a right-handed neutrino) has 16n Weyl fermions has not 
escaped attention [202, 233]. 


The lectures by Witten on topological phases of matter include a clear discussion of 
the Nielsen-Ninomiya theorem [229]. Excellent reviews on the issues surrounding chiral 
fermions on the lattice have been written by Lüscher [127] and Kaplan [118]. 


5. Chiral Symmetry Breaking 


In this section, we discuss the following class of theories: $SU(N_c)$ gauge theory coupled 
to $N_f$ Dirac fermions, each transforming in the fundamental representation of the 
gauge group. A particularly important member of this class is QCD, the theory of the 
strong nuclear interactions, and we will consider this specific theory in some detail in 
Section 5.4. Furthermore, throughout this section we will adopt various terminology of 
QCD. For example, we will refer to the fermions throughout as quarks. 


It turns out that the most startling physics occurs when we take the fermions to 
be massless. For this reason, we will start our discussion with this case, and delay 
consideration of massive fermions to Section 5.2.3. The Lagrangian of the theory is 

$$\mathcal{L} = -\frac{1}{2g^2}\,\mathrm{tr}\,F^{\mu\nu}F_{\mu\nu} + \sum_{i=1}^{N_f} i\,\bar\psi_i\,\slashed{D}\psi_i \qquad (5.1)$$

where $\slashed{D}\psi_i = \slashed{\partial}\psi_i - i\gamma^\mu A_\mu\psi_i$. Here $i = 1,\ldots,N_f$ labels the species of quark and is 
sometimes referred to as a flavour index. (Note that $\psi$ also carries a colour index that 
runs from 1 to $N_c$ and is suppressed in the expressions above.) 


Much of what we have to say below will follow from the global symmetries of the 
theory (5.1). Indeed, the theory has a rather large symmetry group which is only 
manifest when we decompose the fermionic kinetic terms into left-handed and 
right-handed parts 

$$\sum_{i=1}^{N_f} i\,\bar\psi_i\,\slashed{D}\psi_i = \sum_{i=1}^{N_f}\left( i\,\psi^\dagger_{-i}\,\sigma^\mu D_\mu\psi_{-i} + i\,\psi^\dagger_{+i}\,\bar\sigma^\mu D_\mu\psi_{+i}\right)$$

Written in this way, we see that the classical Lagrangian has the symmetry 

$$G_F = U(N_f)_L \times U(N_f)_R$$

which acts as 

$$U(N_f)_L:\ \psi_{-i} \to L_{ij}\,\psi_{-j} \qquad\text{and}\qquad U(N_f)_R:\ \psi_{+i} \to R_{ij}\,\psi_{+j} \qquad (5.2)$$

where L and R are both $N_f \times N_f$ unitary matrices. As we will see in some detail 
below, in the quantum theory different parts of this symmetry group suffer different 
fates. 

Perhaps the least interesting is the overall $U(1)_V$, under which both $\psi_-$ and $\psi_+$ 
transform in the same way: $\psi_{\pm i} \to e^{i\alpha}\psi_{\pm i}$. This symmetry survives and the associated 
conserved quantity counts the number of quark particles of either handedness. In the 
context of QCD, this is referred to as baryon number. 

The other Abelian symmetry is the axial symmetry, $U(1)_A$. Under this, the left-handed 
and right-handed fermions transform with an opposite phase: $\psi_{\pm i} \to e^{\pm i\beta}\psi_{\pm i}$. 
We already saw the fate of this symmetry in Section 3.1 where we learned that it suffers 
an anomaly. 


This means that the global symmetry group of the quantum theory is 

$$G_F = U(1)_V \times SU(N_f)_L \times SU(N_f)_R \qquad (5.3)$$

In this section, our interest lies in what becomes of the two non-Abelian symmetries. 
These act as (5.2), but where L and R are now each elements of $SU(N_f)$ rather than 
$U(N_f)$. 


5.1 The Quark Condensate 


As we’ve seen in Section 2.4, the dynamics of our theory depends on the values of $N_f$ 
and $N_c$. For low enough $N_f$, we expect that the low-energy physics will be dominated 
by two logically independent phenomena. We have met the first of these phenomena 
already: confinement. In this section, we will explore the second of these phenomena: 
the formation of a quark condensate. 


The quark condensate, also known as a chiral condensate, is a vacuum expectation 
value of the composite operators $\bar\psi_{-i}(x)\psi_{+j}(x)$. (As usual in quantum field theory, one 
has to regulate coincident operators of this type to remove any UV divergences.) It 
turns out that the strong coupling dynamics of non-Abelian gauge theories gives rise 
to an expectation value of the form 

$$\langle\bar\psi_{-i}\,\psi_{+j}\rangle = -\sigma\,\delta_{ij} \qquad (5.4)$$

Here $\sigma$ is a constant which has dimension of $[\text{Mass}]^3$ because a free fermion in d = 3+1 
has dimension $[\psi] = 3/2$. (An aside: in Section 2 we referred to the string tension as $\sigma$; 
it’s not the same object that appears here.) The only dimensionful parameter in our 
theory is the strong coupling scale $\Lambda_{QCD}$, so we expect that parametrically $\sigma \sim \Lambda_{QCD}^3$, 
although they may differ by some order 1 number. 

There are a couple of obvious questions that we can ask. 
• Why does this condensate form? 
• What are the consequences of this condensate? 


The first of these questions is, like many things in strongly coupled gauge theories, 
rather difficult to answer with any level of precision, and a complete understanding is 
still lacking. In what follows, we will give some heuristic arguments. In contrast, the 
second question turns out to be surprisingly straightforward to answer, because it is 
determined entirely by symmetry. We will explore this in Section 5.2. 


Why Does the Quark Condensate Form? 


The existence of a quark condensate (5.4) is telling us that the vacuum of space is 
populated by quark-anti-quark pairs. This is analogous to what happens in a superconductor, 
where pairs of electrons condense. 


In a superconductor, the instability to formation of an electron condensate is a result 
of the existence of a Fermi surface, together with a weak attractive force mediated by 
phonons. In the vacuum of space, however, things are not so easy. The formation of 
a quark condensate does not occur at weak coupling. Indeed, this follows on 
dimensional grounds because, as we mentioned above, the only relevant scale in the 
game is $\Lambda_{QCD}$. 


To gain some intuition for why a condensate might form, let’s look at what happens 
at weak coupling $g^2 \ll 1$. Here we can work perturbatively and see how the gluons 
change the quark Hamiltonian. There are two, qualitatively different effects. The first 
is the kind that we already met in Section 2.5.1; a tree level exchange of gluons gives 
rise to a force between quarks. This takes the form 


[Figure: tree-level gluon exchange diagrams] 


As we saw in Section 2.5.1, the upshot of these diagrams is to provide a repulsive 
force between two quarks in the symmetric channel, and an attractive force in the anti- 
symmetric channel. Similarly, a quark-anti-quark pair attract when they form a colour 
singlet and repel when they form a colour adjoint. 

The second term is more interesting for us. The relevant diagrams take the form 


$\Delta H_2 = g^2\ \times$ [Figure: diagrams] 


The novelty of these terms is that they provide matrix elements which mix the 
empty vacuum with a state containing a quark-anti-quark pair. In doing so, they 
change the total number of quarks + anti-quarks. 


The existence of the quark condensate (5.4) is telling us that, in the strong coupling 
regime, terms like $\Delta H_2$ dominate. The resulting ground state has an indefinite number 
of quark-anti-quark pairs. It is perhaps surprising that we can have a vacuum filled 
with quark-anti-quark pairs while still preserving Lorentz invariance. To do this, the 
quark pairs must have opposite quantum numbers for both momentum and angular 
momentum. Furthermore, we expect the condensate to form in the attractive colour 
singlet channel, rather than the repulsive adjoint. 


The handwaving remarks above fall well short of demonstrating the existence of the 
quark condensate. So how do we know that it actually forms? Historically, it was 
first realised from experimental considerations since it explains the spectrum of light 
mesons; we will describe this in some detail in Section 5.4. At the theoretical level, the 
most compelling argument comes from numerical simulations on the lattice. However, 
a full analytic calculation of the condensate is not yet possible. (For what it’s worth, 
the situation is somewhat better in certain supersymmetric non-Abelian gauge theories 
where one has more control over the dynamics and objects like quark condensates can 
be computed exactly.) Finally, there is a beautiful, but rather indirect, argument which 
tells us that the condensate (5.4) must form whenever the theory confines. We will give 
this argument in Section 5.6. 


5.1.1 Symmetry Breaking 


Although the condensate (5.4) preserves the Lorentz invariance of the vacuum, it does 
not preserve all the global symmetries of the theory. To see this, we can act with a 
chiral $SU(N_f)_L \times SU(N_f)_R$ rotation, given by 

$$\psi_{-i} \to L_{ij}\,\psi_{-j} \qquad\text{and}\qquad \psi_{+i} \to R_{ij}\,\psi_{+j}$$

The ground state of our theory is not invariant. Instead, the condensate transforms 
as 

$$\langle\bar\psi_{-i}\,\psi_{+j}\rangle \to -\sigma\,(L^\dagger R)_{ij}$$

This is an example of spontaneous symmetry breaking which, in the present context, 
is known as chiral symmetry breaking (sometimes shortened to $\chi$SB). We see that the 
condensate remains untouched only when L = R. This tells us that the symmetry 
breaking pattern is 

$$G_F = U(1)_V \times SU(N_f)_L \times SU(N_f)_R \ \to\ U(1)_V \times SU(N_f)_V \qquad (5.5)$$

where $SU(N_f)_V$ is the diagonal subgroup of $SU(N_f)_L \times SU(N_f)_R$. The purpose of this 
chapter is to explore the consequences of this symmetry breaking. As we will see, the 
consequences are astonishingly far-reaching. 


Other Symmetry Breaking Patterns 


Throughout this chapter, we will only discuss the symmetry breaking pattern (5.5), 
since this is what is observed in QCD. But before we move on, it’s worth briefly men- 
tioning that other gauge theories can exhibit different symmetry breaking patterns. 


For example, consider an SO(N) gauge theory coupled to $N_f$ Dirac fermions in the 
N-dimensional vector representation. In contrast to the SU(N) gauge theory described 
above, the vector representation of SO(N) is real. This means that we can equivalently 
describe the system as having $2N_f$ Weyl fermions, each of which transforms in the same 
vector representation. Correspondingly, the global symmetry group of this theory is 

$$G_F = SU(2N_f)$$

A chiral condensate of the form (5.4) will spontaneously break 

$$G_F = SU(2N_f) \to O(2N_f)$$


Symmetry breaking patterns of this type are typical for fermions in real representations 
of the gauge group. 


The other representative symmetry breaking pattern occurs for Sp(N) gauge groups, 
again coupled to $N_f$ Dirac fermions in the fundamental (2N-dimensional) representation. 
This representation is pseudo-real; if you take the complex conjugate you can 
turn it back into the original representation through the use of an anti-symmetric 
invariant tensor $J^{ab}$. (A familiar example is SU(2) = Sp(1) where you can turn a $\bar{2}$ 
representation into a 2 representation by multiplying by the $\epsilon^{ab}$ invariant tensor.) This 
means that, once again, the global symmetry group is $G_F = SU(2N_f)$. However, this 
time when the chiral condensate (5.4) forms, it spontaneously breaks 

$$G_F = SU(2N_f) \to Sp(N_f)$$

Symmetry breaking patterns of this type are typical for fermions in pseudo-real 
representations. 
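
The number of Goldstone bosons in each pattern is the number of broken generators, which can be tabulated with a short sketch. It assumes the standard breaking patterns for complex, real and pseudo-real representations, namely SU(N_f) x SU(N_f) to SU(N_f), SU(2N_f) to O(2N_f) and SU(2N_f) to Sp(N_f), with Sp(n) in the convention Sp(1) = SU(2). The function names are hypothetical.

```python
def dim_su(n):
    """Dimension of SU(n)."""
    return n * n - 1

def dim_so(n):
    """Dimension of SO(n), which is also the dimension of O(n)."""
    return n * (n - 1) // 2

def dim_sp(n):
    """Dimension of Sp(n) in the convention Sp(1) = SU(2)."""
    return n * (2 * n + 1)

def goldstone_count(n_f, rep="complex"):
    """Number of broken generators for n_f Dirac fermions in the given type of rep."""
    if rep == "complex":       # SU(n_f)_L x SU(n_f)_R -> SU(n_f)_V
        return 2 * dim_su(n_f) - dim_su(n_f)
    if rep == "real":          # SU(2 n_f) -> O(2 n_f)
        return dim_su(2 * n_f) - dim_so(2 * n_f)
    if rep == "pseudo-real":   # SU(2 n_f) -> Sp(n_f)
        return dim_su(2 * n_f) - dim_sp(n_f)
    raise ValueError("rep must be 'complex', 'real' or 'pseudo-real'")

for rep in ("complex", "real", "pseudo-real"):
    print(rep, goldstone_count(2, rep))
# complex 3
# real 9
# pseudo-real 5
```

For the pattern (5.5) the count is N_f^2 - 1, which gives three Goldstone bosons for two flavours and eight for three.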


5.2 The Chiral Lagrangian 


The existence of a spontaneously broken symmetry (5.5) immediately implies a whole 
slew of interesting phenomena. First, the vacuum of our theory is not unique. Instead, 
there is a manifold of vacua, parameterised by the condensate 

$$\langle\bar\psi_{-i}\,\psi_{+j}\rangle = -\sigma\,U_{ij}$$

where $U \in SU(N_f)$. Next, Goldstone’s theorem tells us that there are massless particles 
in the spectrum. These are bound states of the original quarks, but are now best 
thought of as long-wavelength ripples of the condensate, where its value now varies in 
space and time: U = U(x). Note that there are $N_f^2 - 1$ such Goldstone bosons, one for 
each broken generator in (5.5). We parameterise these excitations by writing 

$$U(x) = \exp\left(\frac{2i}{f_\pi}\,\pi(x)\right) \qquad\text{with}\qquad \pi(x) = \pi^a(x)\,T^a \qquad (5.6)$$

Here $\pi(x)$ is valued in the Lie algebra $su(N_f)$. The matrices $T^a$ are the generators of 
$su(N_f)$ and the component fields $\pi^a(x)$, labelled by $a = 1,\ldots,N_f^2 - 1$, are called pions. 
(As we explain in Section 5.4, these are named after certain mesons in QCD.) 


We have also introduced a dimensionful constant $f_\pi$ in the definition (5.6). For 
now, this ensures that the pions have canonical dimensions for scalar fields in four 
dimensions. It is sometimes called the pion decay constant, although this name makes 
very little sense in our current theory because the pions are stable, massless excitations 
and don’t decay. We’ll see where the name comes from in Section 5.4.3 when we discuss 
how these ideas manifest themselves in the Standard Model. 


The Low-Energy Effective Action 


We would now like to understand the dynamics of the massless Goldstone modes. As we 
will see, at low-energies, the form of this action is entirely determined by the symmetries 
of the theory. 


To proceed, we want to construct a theory of the Goldstone modes U. We will 
require that our theory is invariant under the full global chiral symmetry 
$G_F = U(1)_V \times SU(N_f)_L \times SU(N_f)_R$, under which 

$$U(x) \to L^\dagger\,U(x)\,R$$

What kind of terms can we add to the action consistent with this symmetry? The 
obvious term, $\mathrm{tr}\,U^\dagger U$, is simply a constant because $U^\dagger U = 1$ for $U \in SU(N_f)$, and so 
cannot appear in the action. 
(Here the trace is over the $N_f$ flavour indices.) Happily, this is consistent with the fact 
that U is a massless Goldstone field and it means that we need to look for terms which 
depend on the spacetime derivatives, $\partial_\mu U$. There are, of course, many such terms. 
However, our interest is in the low-energy dynamics which, since we have only massless 
particles, is the same thing as the long-wavelength physics. This means that the most 
important terms are those with the fewest derivatives. 


The upshot of these arguments is that the low-energy effective Lagrangian can be 
written as a derivative expansion. The leading term has two derivatives. At first glance, 
it looks as if there are three different candidates: 

$$\left(\mathrm{tr}\,U^\dagger\partial_\mu U\right)^2 \ ,\qquad \mathrm{tr}\left(\partial^\mu U^\dagger\,\partial_\mu U\right) \ ,\qquad \mathrm{tr}\left(U^\dagger\partial_\mu U\right)^2$$

However the first term vanishes because $U^\dagger\partial_\mu U$ is an $su(N_f)$ generator and, hence, traceless. 
Furthermore, we can use the fact that $U^\dagger\partial_\mu U = -(\partial_\mu U^\dagger)U$ to write the third term 
in terms of the second. This means that, at leading order, there is a unique action that 
describes the dynamics of pions, 

$$\mathcal{L}_2 = \frac{f_\pi^2}{4}\,\mathrm{tr}\left(\partial^\mu U^\dagger\,\partial_\mu U\right) \qquad (5.7)$$

This is the chiral Lagrangian. Although the Lagrangian is very simple, this is not a 
free theory because U is valued in $SU(N_f)$. In fact, this is an example of an important 
class of scalar field theories in which the fields are coordinates on some manifold which, 
in the present case, is the group manifold $SU(N_f)$. Theories of this type are called 
non-linear sigma models and arise in many different areas of physics. 


Historically, the chiral Lagrangian was the first example of a non-linear sigma model, 
first introduced by Gell-Mann and Lévy in 1960. The origin of the name “sigma-model” 
is rather strange: the “sigma-particle” is a particular meson in QCD which, it turns 
out, is the one particle that is not captured by the sigma-model! We will explain this 
a little more in Section 5.4. 


For now, the fact that U is valued in $SU(N_f)$ has a rather straightforward consequence: 
it means that we cannot set U = 0. Indeed, our sigma-model describes a 
degeneracy of ground states, but in each of them $U \neq 0$. This ensures that the chiral 
Lagrangian spontaneously breaks the $SU(N_f)_L \times SU(N_f)_R$ symmetry, as it must. 


5.2.1 Pion Scattering 


The beauty of the chiral Lagrangian is that it contains an infinite number of interaction 
terms, packaged in a simple form by the demands of symmetry. To see these interactions
more explicitly, we rewrite the chiral Lagrangian in terms of the pion fields defined in
(5.6). Keeping only the quadratic and quartic terms, the chiral Lagrangian $\mathcal{L}_2$ becomes

$$\mathcal{L}_2 = \text{tr}\,(\partial\pi)^2 - \frac{2}{3f_\pi^2}\,\text{tr}\left(\pi^2(\partial\pi)^2 - (\pi\,\partial\pi)^2\right) + \ldots \qquad (5.8)$$

Note that if we use $\text{tr}\,T^aT^b = \frac{1}{2}\delta^{ab}$ for $su(N_f)$ generators, then the kinetic term has the
standard normalisation for each pion field: $\text{tr}\,(\partial\pi)^2 = \frac{1}{2}\,\partial^\mu\pi^a\,\partial_\mu\pi^a$.


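An Example: $N_f = 2$


For concreteness, we work with $N_f = 2$ and take the $su(2)$ generators to be proportional
to the Pauli matrices: $T^a = \frac{1}{2}\sigma^a$. The interaction terms then read

$$\mathcal{L}_{\rm int} = -\frac{1}{6f_\pi^2}\left(\pi^a\pi^a\,\partial\pi^b\,\partial\pi^b - \pi^a\partial\pi^a\,\pi^b\partial\pi^b\right)$$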
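
The step from the trace form in (5.8) to this component form rests on Pauli-matrix algebra. A minimal numerical sketch, assuming random test values for $\pi^a$ and for a single component of $\partial_\mu\pi^a$ (the helper names are illustrative):

```python
import random

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def pion_matrix(components):
    """Build pi = pi^a T^a with T^a = sigma^a / 2."""
    return [[sum(0.5 * c * s[i][j] for c, s in zip(components, SIGMA))
             for j in range(2)] for i in range(2)]

random.seed(0)
pi_a = [random.uniform(-1, 1) for _ in range(3)]
dpi_a = [random.uniform(-1, 1) for _ in range(3)]  # one component of d_mu pi^a

P, dP = pion_matrix(pi_a), pion_matrix(dpi_a)
PdP = matmul(P, dP)

# Trace structure appearing in (5.8): tr( pi^2 (d pi)^2 - (pi d pi)^2 )
lhs = trace(matmul(matmul(P, P), matmul(dP, dP))) - trace(matmul(PdP, PdP))

# Component form: (1/4) [ (pi.pi)(dpi.dpi) - (pi.dpi)^2 ]
rhs = 0.25 * (dot(pi_a, pi_a) * dot(dpi_a, dpi_a) - dot(pi_a, dpi_a) ** 2)

print(abs(lhs - rhs))  # should be zero up to rounding
```

Multiplying the factor of $\frac{1}{4}$ found here by the $-\frac{2}{3f_\pi^2}$ in (5.8) gives the overall $-\frac{1}{6f_\pi^2}$.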

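From this we can read off the tree-level $\pi\pi \to \pi\pi$ scattering amplitude using the
techniques that we described in the Quantum Field Theory lectures. We label the two
incoming momenta as $p_a$ and $p_b$ and the two outgoing momenta as $p_c$ and $p_d$. The
amplitude is

$$i\mathcal{A}^{abcd} = \frac{i}{6f_\pi^2}\Big[\delta^{ab}\delta^{cd}\Big(4(p_a\cdot p_b + p_c\cdot p_d) + 2(p_a\cdot p_c + p_a\cdot p_d + p_b\cdot p_c + p_b\cdot p_d)\Big) + (b\leftrightarrow c) + (b\leftrightarrow d)\Big]$$

Here the exchanges act as if all momenta were incoming, so an outgoing momentum $p_c$ or $p_d$
picks up a minus sign when it trades places with $p_b$.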


Momentum conservation, $p_a + p_b = p_c + p_d$, ensures that some of these terms cancel.
This is perhaps simplest to see using Mandelstam variables which, because all particles
are massless, are defined as

$$s = (p_a + p_b)^2 = 2p_a\cdot p_b = 2p_c\cdot p_d$$
$$t = (p_a - p_c)^2 = -2p_a\cdot p_c = -2p_b\cdot p_d$$
$$u = (p_a - p_d)^2 = -2p_a\cdot p_d = -2p_b\cdot p_c$$

Using the relation $s + t + u = 0$, the amplitude takes the particularly simple form,

$$i\mathcal{A}^{abcd} = \frac{i}{f_\pi^2}\left[\delta^{ab}\delta^{cd}\,s + \delta^{ac}\delta^{bd}\,t + \delta^{ad}\delta^{bc}\,u\right]$$
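
The simplification of the $\delta^{ab}\delta^{cd}$ structure can be checked on explicit kinematics. The sketch below assumes massless $2\to2$ scattering in the centre-of-mass frame with metric signature $(+,-,-,-)$; the energy and angles are arbitrary test values. It confirms that $s+t+u=0$ and that the bracket multiplying $\delta^{ab}\delta^{cd}$ equals $6s$, which cancels the 6 in the prefactor.

```python
import math

def mdot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - sum(a * b for a, b in zip(p[1:], q[1:]))

def massless_kinematics(energy, theta, phi):
    """Centre-of-mass momenta for massless 2 -> 2 scattering."""
    n = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    pa = (energy, 0.0, 0.0, energy)
    pb = (energy, 0.0, 0.0, -energy)
    pc = (energy,) + tuple(energy * x for x in n)
    pd = (energy,) + tuple(-energy * x for x in n)
    return pa, pb, pc, pd

pa, pb, pc, pd = massless_kinematics(energy=1.3, theta=0.7, phi=2.1)

s = 2 * mdot(pa, pb)
t = -2 * mdot(pa, pc)
u = -2 * mdot(pa, pd)

# Bracket multiplying delta^{ab} delta^{cd} in the unsimplified amplitude.
bracket = (4 * (mdot(pa, pb) + mdot(pc, pd))
           + 2 * (mdot(pa, pc) + mdot(pa, pd) + mdot(pb, pc) + mdot(pb, pd)))

print(s + t + u, bracket - 6 * s)  # both should vanish up to rounding
```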


Above we have worked at tree level, keeping only the two-derivative terms. We can try 
to improve our results in two ways: we can include higher derivative terms in the chiral 
Lagrangian, and we can try to calculate diagrams at one-loop level and higher.


At the next order in the derivative expansion, there are three independent terms.
We have $\mathcal{L} = \mathcal{L}_2 + \mathcal{L}_4$ with

$$\mathcal{L}_4 = a_1\left(\text{tr}\,\partial^\mu U^\dagger\partial_\mu U\right)^2 + a_2\left(\text{tr}\,\partial_\mu U^\dagger\partial_\nu U\right)\left(\text{tr}\,\partial^\mu U^\dagger\partial^\nu U\right) + a_3\,\text{tr}\left(\partial_\mu U^\dagger\partial^\mu U\,\partial_\nu U^\dagger\partial^\nu U\right) \qquad (5.9)$$

Here $a_i$ are dimensionless coupling constants. These terms will provide corrections to
pion-pion scattering that are suppressed at low energy by powers of $E/f_\pi$.


Next: loops. The chiral Lagrangian (5.7) is non-renormalisable, which means that
we need an infinite number of counterterms to absorb divergences. However, this
shouldn’t be viewed as any kind of obstacle; the theory is designed only to make sense
up to a UV cut-off of order $f_\pi$. As long as we restrict our attention to low energies, the
theory is fully predictive.


In fact, there is a slightly more interesting story here which I will not describe in
detail. If you compute the one-loop correction to pion scattering from $\mathcal{L}_2$, you will find
that it scales as $p^4\log p^2$. The presence of the logarithm means that this term cannot
be generated by a tree graph from higher order terms in the chiral Lagrangian and,
indeed, at low energies it is enhanced relative to the contributions from $\mathcal{L}_4$.


Furthermore, it turns out that there is a term more important than $\mathcal{L}_4$ that we’ve
missed. This is known as the Wess-Zumino-Witten term. It doesn’t contribute to pion
scattering, so we can neglect it for the purposes above. However, it plays a key role in
the overall structure of the theory. We will discuss this term in detail in Section 5.5.


5.2.2 Currents 


We started our discussion with the microscopic non-Abelian gauge theory (5.1) and 
have ended up, at low-energies, with a very different looking theory (5.7). In general, 
it is useful to know how operators in the UV get mapped to operators in the IR. There 
is one class of operators for which this map is particularly straightforward: these are
the currents associated to the $SU(N_f)_L \times SU(N_f)_R$ chiral symmetry.


In the microscopic theory, the flavour currents are written most simply in terms of
the vector and axial combinations: $J^a_{V\mu} = J^a_{L\mu} + J^a_{R\mu}$ and $J^a_{A\mu} = J^a_{L\mu} - J^a_{R\mu}$, with the
familiar expressions

$$J^a_{V\mu} = \bar\psi_i\,T^a_{ij}\gamma_\mu\,\psi_j \quad\text{and}\quad J^a_{A\mu} = \bar\psi_i\,T^a_{ij}\gamma_\mu\gamma^5\,\psi_j \qquad (5.10)$$

where $T^a$ are $su(N_f)$ generators. What are the analogous expressions in the chiral
Lagrangian?


To answer this, let’s start with $SU(N_f)_L$. Consider the infinitesimal transformation

$$L = e^{i\alpha^aT^a} \approx 1 + i\alpha^aT^a$$

Under this $U \to L^\dagger U$ so, infinitesimally,

$$\delta_L U = -i\alpha^aT^aU$$

We can now compute the current using the standard trick: elevate $\alpha^a \to \alpha^a(x)$. The
Lagrangian is no longer invariant, but now transforms as $\delta\mathcal{L} = \partial_\mu\alpha^a\,J^{a\,\mu}_L$; the function
$J^a_{L\mu}$ is the current that we’re looking for. Implementing this, we find

$$J^a_{L\mu} = \frac{if_\pi^2}{4}\,\text{tr}\left(U^\dagger T^a\,\partial_\mu U - (\partial_\mu U^\dagger)\,T^aU\right) \qquad (5.11)$$

We can also expand this in pion fields (5.6). To leading order we have simply

$$J^a_{L\mu} \approx -\frac{f_\pi}{2}\,\partial_\mu\pi^a$$
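
The variation behind the left-handed current can be spelled out. The following is a sketch of the steps, assuming Hermitian generators $T^a$, the two-derivative Lagrangian (5.7), and keeping only the terms proportional to $\partial_\mu\alpha^a$:

```latex
\begin{aligned}
\delta U &= -i\alpha^a T^a U , \qquad
\delta U^\dagger = +i\alpha^a\, U^\dagger T^a \\
\partial_\mu(\delta U) &\supset -i(\partial_\mu\alpha^a)\, T^a U , \qquad
\partial_\mu(\delta U^\dagger) \supset +i(\partial_\mu\alpha^a)\, U^\dagger T^a \\
\delta\mathcal{L}_2 &= \frac{f_\pi^2}{4}\,\mathrm{tr}\left(\partial^\mu(\delta U^\dagger)\,\partial_\mu U
   + \partial^\mu U^\dagger\,\partial_\mu(\delta U)\right) \\
&\supset \partial_\mu\alpha^a\ \frac{i f_\pi^2}{4}\,
   \mathrm{tr}\left(U^\dagger T^a\,\partial^\mu U - (\partial^\mu U^\dagger)\,T^a U\right) \\
U \approx 1 + \frac{2i}{f_\pi}\pi \ &\Rightarrow\
\mathrm{tr}\left(U^\dagger T^a\,\partial_\mu U - (\partial_\mu U^\dagger)\,T^a U\right)
   \approx \frac{4i}{f_\pi}\,\mathrm{tr}\left(T^a\partial_\mu\pi\right)
   = \frac{2i}{f_\pi}\,\partial_\mu\pi^a
\end{aligned}
```

Multiplying the last line by $if_\pi^2/4$ gives $-\frac{f_\pi}{2}\partial_\mu\pi^a$ at leading order in the pion fields.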


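Similarly, under $SU(N_f)_R$, we have $\delta_R U = i\alpha^a\,UT^a$ and

$$J^a_{R\mu} = \frac{if_\pi^2}{4}\,\text{tr}\left(-T^aU^\dagger\partial_\mu U + (\partial_\mu U^\dagger)\,UT^a\right) \approx +\frac{f_\pi}{2}\,\partial_\mu\pi^a \qquad (5.12)$$

Note that both currents have non-vanishing matrix elements between the vacuum $|0\rangle$
and a one-particle pion state $|\pi^a(p)\rangle$. For example

$$\langle 0|J^a_{L\mu}(x)|\pi^b(p)\rangle = i\,\frac{f_\pi}{2}\,\delta^{ab}\,p_\mu\,e^{-ip\cdot x} \qquad (5.13)$$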


Historically, the approach to chiral symmetry breaking was known as current algebra,
and this equation plays a starring role. It is telling us that the chiral $SU(N_f)_L \times
SU(N_f)_R$ is spontaneously broken, and that the currents, acting on the vacuum, give rise
to the particles that we call pions.


Although the chiral symmetry is broken, the diagonal combination $SU(N_f)_V$ survives, and

$$\langle 0|J^a_{V\mu}|\pi^b\rangle = \langle 0|J^a_{L\mu} + J^a_{R\mu}|\pi^b\rangle = 0$$


5.2.3 Adding Masses


Our discussion so far has been for massless quarks. We now consider the effect of
turning on masses. The Lagrangian is

$$\mathcal{L} = -\frac{1}{2g^2}\,\text{tr}\,F_{\mu\nu}F^{\mu\nu} + \sum_{i=1}^{N_f}\left(i\bar\psi_i\,\gamma^\mu D_\mu\,\psi_i - m_i\bar\psi_i\psi_i\right)$$

If the masses are large compared to $\Lambda_{QCD}$, then the quarks play no role in the
low-energy physics. Here we will be interested in the situation where the masses are small,
$m_i \ll \Lambda_{QCD}$.


It is a general rule, and a deep fact about quantum field theory, that turning
on a mass for fermions always breaks some global symmetry. In the present case, the
masses explicitly break the chiral symmetry. If all the masses are equal, then there
remains a non-Abelian $U(N_f)_V$ flavour symmetry. In contrast, if all the masses are
different, we have only the Cartan subalgebra $U(1)^{N_f}$.


In the previous section, we saw that we can derive powerful statements about the
low-energy physics due to the spontaneous breaking of the chiral symmetry. Now
this symmetry is explicitly broken by the masses themselves, but all is not lost. For
$m_i \ll \Lambda_{QCD}$, we still have an approximate chiral symmetry. The quark condensate is
still associated to the scale $\Lambda_{QCD}$, and the masses give only a small correction. This
means that we can still write

$$\langle\bar\psi_{-i}\,\psi_{+j}\rangle \approx -\sigma\,U_{ij}$$

with $U \in SU(N_f)$. We can then incorporate the masses in the chiral Lagrangian by
introducing the $N_f \times N_f$ mass matrix,

$$M = \text{diag}(m_1,\ldots,m_{N_f})$$


In the presence of masses, the leading order chiral Lagrangian is

$$\mathcal{L}_2 = \frac{f_\pi^2}{4}\,\text{tr}\left(\partial^\mu U^\dagger\partial_\mu U\right) + \sigma\,\text{tr}\left(MU + U^\dagger M^\dagger\right)$$

This lifts the vacuum manifold of the theory. It can be thought of as adding a potential
to the $SU(N_f)$ vacuum moduli space, resulting in a unique ground state. To see the
effect in terms of pion fields, we can again expand $U = e^{2i\pi/f_\pi}$, to find

$$\mathcal{L}_2 = \text{tr}\,(\partial\pi)^2 - \frac{2\sigma}{f_\pi^2}\,\text{tr}\left((M + M^\dagger)\,\pi^2\right) + \ldots \qquad (5.14)$$

and we see that we get a mass term for the pions as expected.
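
The origin of the pion mass term can be made explicit by expanding the trace to second order in the pion fields. A sketch, assuming only $U = e^{2i\pi/f_\pi}$ with Hermitian $\pi$:

```latex
\begin{aligned}
U &= 1 + \frac{2i}{f_\pi}\,\pi - \frac{2}{f_\pi^2}\,\pi^2 + \mathcal{O}(\pi^3) , \qquad
U^\dagger = 1 - \frac{2i}{f_\pi}\,\pi - \frac{2}{f_\pi^2}\,\pi^2 + \mathcal{O}(\pi^3) \\
\mathrm{tr}\left(MU + U^\dagger M^\dagger\right) &= \mathrm{tr}\,(M + M^\dagger)
   + \frac{2i}{f_\pi}\,\mathrm{tr}\left((M - M^\dagger)\,\pi\right)
   - \frac{2}{f_\pi^2}\,\mathrm{tr}\left((M + M^\dagger)\,\pi^2\right) + \mathcal{O}(\pi^3)
\end{aligned}
```

The first term is a constant, the linear term vanishes for a real diagonal mass matrix, and the quadratic term is the mass term for the pions.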


5.3 Miraculously, Baryons 


The purpose of the chiral Lagrangian is to describe the low-energy dynamics of pions.
These are the massless Goldstone bosons that arise after spontaneous symmetry breaking
which, in terms of the original quarks, take the schematic form $\bar\psi_i\psi_j$. These particles
are all neutral under the $U(1)_V$ vector symmetry.


There are also bound states of quarks which carry quantum numbers under $U(1)_V$.
These are the baryons that arise by contracting the $a = 1,\ldots,N_c$ colour indices.
Schematically these take the form

$$\epsilon^{a_1\ldots a_{N_c}}\,\psi_{i_1a_1}\ldots\psi_{i_{N_c}a_{N_c}} \qquad (5.15)$$

where we have neglected the spinor indices. The baryons are bosons when $N_c$ is even
and fermions when $N_c$ is odd. With our normalisation, they have charge $+N_c$ under the
vector symmetry $U(1)_V$. Often one rescales the charges of the quarks to have $U(1)_V$
charge $1/N_c$ so that the baryon has charge $+1$; this re-scaled symmetry is then referred
to simply as baryon number.


Assuming that our theory confines, the baryons are expected to have mass $\sim \Lambda_{QCD}$.
Nonetheless, they are the lightest particles carrying $U(1)_V$ charge and so are stable.


There is no reason to expect that the chiral Lagrangian knows anything about the 
baryons. Indeed, to construct the chiral Lagrangian we intentionally threw out all but 
the massless excitations. It is therefore something of a wonderful surprise to learn that 
the baryons do arise in the chiral Lagrangian: they are solitons. 


The Topological Charge 


Let’s first show that the chiral Lagrangian has a hidden conserved current. Static field
configurations in the chiral Lagrangian are described by a map from spatial $\mathbf{R}^3$ to the
group manifold $SU(N_f)$. If we insist that the field approaches the same vacuum state
asymptotically so that, for example,

$$U(\mathbf{x}) \to 1 \quad\text{as}\quad |\mathbf{x}| \to \infty$$

then we effectively compactify $\mathbf{R}^3$ to $S^3$. Now static configurations can be thought of
as a map

$$U(\mathbf{x}):\ S^3 \to SU(N_f)$$

Such configurations are characterised by their winding

$$\Pi_3(SU(N_f)) = \mathbf{Z}$$

This winding number, which we denote by $B \in \mathbf{Z}$, is computed by the integral

$$B = \frac{1}{24\pi^2}\int d^3x\ \epsilon^{ijk}\,\text{tr}\left(U^\dagger(\partial_iU)\,U^\dagger(\partial_jU)\,U^\dagger(\partial_kU)\right) \qquad (5.16)$$


In fact, we can go further and write down a local current

$$B^\mu = \frac{1}{24\pi^2}\,\epsilon^{\mu\nu\rho\sigma}\,\text{tr}\left(U^\dagger(\partial_\nu U)\,U^\dagger(\partial_\rho U)\,U^\dagger(\partial_\sigma U)\right)$$

which obeys $\partial_\mu B^\mu = 0$ by virtue of the anti-symmetric tensor. The winding number is
then given by $B = \int d^3x\ B^0$.


It is natural to search for an interpretation of this conserved current $B^\mu$ in terms
of the microscopic theory. The only candidate is $U(1)_V$, strongly suggesting that we
should identify $B^\mu$ with the baryon number current and, correspondingly, the solitons
with baryons. This appears to be magic. We tried to throw away everything that
wasn’t massless. But if you treat the pions correctly, the baryons reappear as solitons.


A First Attempt at Solutions 


What do these soliton solutions look like? Let’s start with the two-derivative chiral
Lagrangian. The associated energy functional for static field configurations is

$$E = \frac{f_\pi^2}{4}\int d^3x\ \text{tr}\,\partial_iU^\dagger\,\partial_iU$$

where now $i = 1,2,3$ runs over spatial indices only. Solutions to the equations of
motion are minima (or, more generally, saddle points) of this energy functional. A
simple scaling argument tells us that these don’t exist. To see this, consider a putative
solution $U_\star(\mathbf{x})$ with energy $E_\star$. Then the new configuration $U_\lambda(\mathbf{x}) = U_\star(\lambda\mathbf{x})$ has energy

$$E_\lambda = \frac{f_\pi^2}{4}\int d^3x\ \text{tr}\,\partial_iU_\star^\dagger(\lambda\mathbf{x})\,\partial_iU_\star(\lambda\mathbf{x}) = \frac{1}{\lambda}\,E_\star$$

We see that we can always lower the energy of any configuration simply by rescaling
its size. This simple observation, which goes by the name of Derrick’s theorem,
means that although the chiral Lagrangian has the topology to support solitons, no
static solutions exist. The reason for this is that the classical theory is scale invariant
so there is nothing to set the size of the soliton. (The only dimensionful quantity, $f_\pi$,
multiplies the whole action and so doesn’t affect the classical equations of motion.)


5.3.1 The Skyrme Model 


The situation improves when we include higher derivative terms. These will scale
differently with $\lambda$, and may result in a minimum of the energy functional.


We saw previously that there are three possible terms with four derivatives (5.9),

$$\mathcal{L}_4 = a_1\left(\text{tr}\,\partial^\mu U^\dagger\partial_\mu U\right)^2 + a_2\left(\text{tr}\,\partial_\mu U^\dagger\partial_\nu U\right)\left(\text{tr}\,\partial^\mu U^\dagger\partial^\nu U\right) + a_3\,\text{tr}\left(\partial_\mu U^\dagger\partial^\mu U\,\partial_\nu U^\dagger\partial^\nu U\right)$$

and we expect that the effective action contains all three terms with some choice of
coefficients $a_1$, $a_2$ and $a_3$. However, it turns out to be much easier to discuss solitons if
we take a particular linear combination of these terms. We take the effective action to
be

$$\mathcal{L} = \frac{f_\pi^2}{4}\,\text{tr}\left(\partial^\mu U^\dagger\partial_\mu U\right) + \frac{1}{32g^2}\,\text{tr}\left([U^\dagger\partial_\mu U, U^\dagger\partial_\nu U]\,[U^\dagger\partial^\mu U, U^\dagger\partial^\nu U]\right)$$

This is called the Skyrme model.


There is no first-principles justification for this particular 4-derivative term although
it’s worth mentioning that it is the unique term which contains no more than two
time derivatives, making it more straightforward to interpret the classical equations of
motion. Here $g^2$ is a dimensionless coupling constant that will ultimately determine
the scale of the soliton relative to $f_\pi$.
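
The way a four-derivative term evades the scaling argument can be made concrete. Under $U(\mathbf{x}) \to U(\lambda\mathbf{x})$ in three spatial dimensions, a two-derivative energy $E_2$ scales as $E_2/\lambda$ while a four-derivative energy $E_4$ scales as $\lambda E_4$. The sketch below locates the preferred scale; the numbers `E2` and `E4` are placeholders chosen for illustration, not values computed from any field configuration.

```python
import math

def rescaled_energy(lam, E2, E4):
    """Energy of U(lam * x), given the two- and four-derivative
    contributions E2 and E4 of the original configuration U(x)."""
    return E2 / lam + lam * E4

def preferred_scale(E2, E4):
    """Stationary point of rescaled_energy: dE/dlam = -E2/lam^2 + E4 = 0."""
    return math.sqrt(E2 / E4)

E2, E4 = 3.0, 0.75          # hypothetical positive contributions
lam_star = preferred_scale(E2, E4)
E_min = rescaled_energy(lam_star, E2, E4)

# Without the four-derivative term the energy keeps falling as lam grows.
derrick = [rescaled_energy(lam, E2, 0.0) for lam in (1.0, 2.0, 4.0)]

print(lam_star, E_min, derrick)
```

At the stationary point the two contributions are equal, $E_2/\lambda = \lambda E_4$, and the total is $2\sqrt{E_2E_4}$. With $E_4 = 0$ there is no stationary point, which is the content of Derrick’s theorem.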


To simplify our notation, we introduce the $su(N_f)$-valued current

$$L_\mu = U^\dagger\partial_\mu U$$

After massaging the four-derivative terms, you can check that the static energy can be
written as

$$E = \frac{f_\pi^2}{4}\int d^3x\ \text{tr}\left(L_iL_i^\dagger - \frac{1}{8g^2f_\pi^2}\,[L_i,L_j][L_i,L_j]\right)$$

We now use the Bogomolnyi trick that we already employed in Section 2 for instantons,
vortices and monopoles: we write the energy functional as a total square,

$$E = \frac{f_\pi^2}{4}\int d^3x\ \text{tr}\,\Big|L_i \mp \frac{1}{2gf_\pi}\,\epsilon_{ijk}L_jL_k\Big|^2 \mp \frac{f_\pi}{4g}\int d^3x\ \epsilon_{ijk}\,\text{tr}\left(L_iL_jL_k\right)$$

where $|X_i|^2 = X_iX_i^\dagger$. The first term is clearly non-negative. But the second term is
something that we’ve seen before: it is the topological winding (5.16) that we identified
with the baryon number $B$. We learn that the energy is bounded below by the baryon number

$$E \geq \frac{6\pi^2f_\pi}{g}\,|B| \qquad (5.17)$$
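
The completion of the square can be verified term by term. A sketch of the algebra, writing $X_i = L_i \mp \frac{1}{2gf_\pi}\epsilon_{ijk}L_jL_k$ (a shorthand introduced here for convenience) and using the fact that $L_i$ and $X_i$ are anti-Hermitian:

```latex
\begin{aligned}
\mathrm{tr}\,X_iX_i^\dagger = -\mathrm{tr}\,X_iX_i
 &= -\mathrm{tr}\,L_iL_i \pm \frac{1}{gf_\pi}\,\epsilon_{ijk}\,\mathrm{tr}\,(L_iL_jL_k)
    - \frac{1}{4g^2f_\pi^2}\,\epsilon_{ijk}\epsilon_{ilm}\,\mathrm{tr}\,(L_jL_kL_lL_m) \\
\epsilon_{ijk}\epsilon_{ilm}\,\mathrm{tr}\,(L_jL_kL_lL_m)
 &= \tfrac{1}{2}\,\mathrm{tr}\,[L_j,L_k][L_j,L_k] \\
\frac{f_\pi^2}{4}\,\mathrm{tr}\,X_iX_i^\dagger
 &= \frac{f_\pi^2}{4}\,\mathrm{tr}\left(L_iL_i^\dagger
    - \frac{1}{8g^2f_\pi^2}\,[L_j,L_k][L_j,L_k]\right)
    \pm \frac{f_\pi}{4g}\,\epsilon_{ijk}\,\mathrm{tr}\,(L_iL_jL_k) \\
\int d^3x\ \epsilon_{ijk}\,\mathrm{tr}\,(L_iL_jL_k) &= 24\pi^2 B
\end{aligned}
```

The third line shows that the square reproduces the static energy up to the cross term, and the last line, which is (5.16), turns that cross term into $\frac{6\pi^2f_\pi}{g}B$ up to the choice of sign.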


This now looks more promising: the energy of multiple baryons grows at least linearly
with $B$. Soliton configurations with non-trivial winding are called Skyrmions and are
identified with baryons in the theory.


5.3.2 Skyrmions


Let’s see what Skyrmion solutions look like. The usual way to proceed with bounds
like (5.17) is to try to saturate them. For $B > 0$, this occurs when the fields obey the
first order differential equation

$$L_i = -\frac{1}{2gf_\pi}\,\epsilon_{ijk}L_jL_k \qquad (5.18)$$

While this is usually a sensible approach, it turns out that it doesn’t help in the present
case. One can show that there are no solutions to (5.18). Instead, we must turn to the
full, second order, equations of motion and solve

$$\partial_\mu L^\mu = \frac{1}{4f_\pi^2g^2}\,\partial_\mu\left[L_\nu, [L^\mu, L^\nu]\right] \qquad (5.19)$$

We will solve this for the simplest case of $N_f = 2$. Here, the target space is the group
manifold $SU(2) \cong S^3$. For a single Skyrmion, the field
$U(\mathbf{x})$ must wrap once around the $S^3$ target space as we move around the spatial $\mathbf{R}^3$.
This is achieved by the so-called hedgehog ansatz,

$$U_{\rm Skyrme}(\mathbf{x}) = \exp\left(if(r)\,\boldsymbol{\sigma}\cdot\hat{\mathbf{x}}\right) = \cos f(r) + i\,\boldsymbol{\sigma}\cdot\hat{\mathbf{x}}\,\sin f(r) \qquad (5.20)$$

This field configuration has winding number $B = 1$ if we pick the function $f(r)$ to have
boundary conditions

$$f(r) \to \begin{cases} \pi & \text{as } r \to \infty \\ 0 & \text{at } r = 0 \end{cases}$$
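
For the hedgehog ansatz the winding integral (5.16) reduces to a one-dimensional integral: evaluating the trace gives $\epsilon^{ijk}\,\text{tr}(\ldots) = 12\sin^2\!f\,f'/r^2$, so that $B = \frac{2}{\pi}\int_0^\infty dr\,\sin^2\!f\,f'$. The sketch below checks that this is 1 for a trial profile. The profile $f(r) = \pi(1 - e^{-r})$ is an arbitrary choice that satisfies the boundary conditions; it is not the true Skyrmion solution, and the cut-off `r_max` and step count are likewise illustrative.

```python
import math

def profile(r):
    """Trial profile with f(0) = 0 and f -> pi as r -> infinity."""
    return math.pi * (1.0 - math.exp(-r))

def profile_prime(r):
    return math.pi * math.exp(-r)

def winding_number(r_max=40.0, steps=20000):
    """Midpoint-rule estimate of B = (2/pi) * int_0^r_max sin^2(f) f' dr."""
    h = r_max / steps
    total = 0.0
    for n in range(steps):
        r = (n + 0.5) * h
        total += math.sin(profile(r)) ** 2 * profile_prime(r)
    return (2.0 / math.pi) * total * h

B = winding_number()
print(B)
```

Because the integrand is a total derivative of $\frac{1}{\pi}\left(f - \frac{1}{2}\sin 2f\right)$, the result depends only on the boundary values of $f$, which is why any profile with these boundary conditions gives the same winding.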


The equation of motion (5.19) then becomes an ordinary differential equation for $f(r)$
which, with $r$ measured in units of $1/gf_\pi$, reads

$$(r^2 + 2\sin^2\!f)\,f'' + 2rf' + \sin 2f\,f'^2 - \sin 2f - \frac{\sin^2\!f\,\sin 2f}{r^2} = 0$$

This can be solved numerically; the solution is a monotonically increasing function whose
exact form is not needed for our purposes. The energy of this solution turns out to be about
25% higher than the bound (5.17).


Our Skyrme model is built around symmetries. For $N_f = 2$, the symmetry group
is $SU(2)_L \times SU(2)_R$, but if we insist (as we did above) that the field tends towards
its vacuum value asymptotically, $U(\mathbf{x}) \to 1$, then it leaves us only with the diagonal
$SU(2)_V$ as a global symmetry. Including the group of spatial rotations, we have the
symmetry group

$$SU(2)_{\rm rot} \times SU(2)_V \qquad (5.21)$$

The single Skyrmion (5.20) is not invariant under either of these $SU(2)$ groups separately.
However, it is invariant under the diagonal $SU(2)$ which acts simultaneously as
a spatial and flavour rotation.


The subgroup of (5.21) which acts non-trivially on the Skyrmion solution (5.20) 
can be used to generate new solutions. These are trivially related to the original, 
and just change its embedding in the target space. Nonetheless, they have important 
consequences. After quantisation, they endow the Skyrmion with quantum numbers 
under $SU(2)_V$. For example, one can show that the simplest Skyrmion described above
sits in a doublet of $SU(2)_V$. In QCD, viewed as having two light quarks, this is
interpreted as the proton and neutron. 


The Skyrme model has spawned a mini-industry, and there is much more to say 
about its quantisation, and its utility in describing both nucleons and higher nuclei. 
We won't say this here. 


There is, however, one important aspect of Skyrmions that we have not yet understood:
their quantum statistics. Since the baryon (5.15) contains $N_c$ quarks, we would
hope that the Skyrmion is a boson when $N_c$ is even and a fermion when $N_c$ is odd.
Yet, so far, the chiral Lagrangian knows nothing about the number of colours $N_c$. It
turns out that we have missed a rather subtle term in the effective action, known as
the Wess-Zumino-Witten term. This will be introduced in Section 5.5, and in Section
5.5.3 we will see that it indeed makes the Skyrmion fermionic or bosonic depending on the
number of colours $N_c$.


5.4 QCD 


Until now, we have kept our discussion general. However, there is one example of the
class of theories that we have been discussing whose importance dwarfs all others.
This is QCD, the theory of the strong nuclear interaction.


QCD is an $SU(3)$ gauge theory coupled to $N_f = 6$ Dirac fermions that we call quarks.
However, for many questions concerning the low-energy behaviour of the theory, only
two (or sometimes three) of these quarks are important. To see why, we need to
look at their masses. (I’ve included their electromagnetic charge $Q$ for convenience.)


Quark       | Charge | Mass (in MeV)
d = down    | -1/3   | 4
u = up      | +2/3   | 2
s = strange | -1/3   | 95
c = charm   | +2/3   | 1250
b = bottom  | -1/3   | 4200
t = top     | +2/3   | 170,000


Note that the up quark is lighter than the down, an inversion of the hierarchy relative to
the other two generations. We can compare these quark masses to the strong coupling
scale,

$$\Lambda_{QCD} \approx 300\ \text{MeV}$$

We see that the masses of the two lightest quarks obey $m_u, m_d \ll \Lambda_{QCD}$ while the strange
quark has mass $m_s < \Lambda_{QCD}$, although there is not a large separation of scales. Meanwhile,
the other three quarks are clearly substantially heavier than $\Lambda_{QCD}$ and play no
role in the low-energy physics. This means that, for many purposes, we can consider
QCD to have $N_f = 3$ quarks while, for some purposes, we may want to take $N_f = 2$.
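
The comparison of scales can be written out explicitly. A minimal sketch, using only the masses in the quark table and the value of $\Lambda_{QCD}$ quoted above (the function and variable names are illustrative):

```python
LAMBDA_QCD = 300.0  # MeV, the strong coupling scale quoted above

QUARK_MASSES = {    # MeV, taken from the quark table
    "d": 4.0, "u": 2.0, "s": 95.0, "c": 1250.0, "b": 4200.0, "t": 170000.0,
}

def light_quarks(masses, scale):
    """Return the flavours with mass below the given scale, lightest first."""
    return sorted((q for q, m in masses.items() if m < scale), key=masses.get)

light = light_quarks(QUARK_MASSES, LAMBDA_QCD)
ratios = {q: QUARK_MASSES[q] / LAMBDA_QCD for q in light}

print(light)
print(ratios)
```

The ratios make the hierarchy visible: the up and down quarks sit at the percent level of $\Lambda_{QCD}$, while the strange quark is at roughly a third of it, which is why treating it as light is a cruder approximation.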


When we take $N_f = 3$, we have several different $SU(3)$ groups floating around. The
gauge group is $SU(3)$ and the global symmetry group is $SU(3)_L \times SU(3)_R$, which is
spontaneously broken down to $SU(3)_V$ by the chiral condensate. In this section, it is
these global symmetries that are of interest.


The global flavour symmetries are not exact because they are broken explicitly by
the quark masses. The fact that $m_u \approx m_d$ means that the $SU(2)_V \subset SU(3)_V$ subgroup
which rotates only up and down quarks is a rather better symmetry of Nature than
the full $SU(3)_V$. This approximate $SU(2)_V$ symmetry was first noticed by Heisenberg
in 1932 and is called isospin.


Confinement of quarks means that the particles we observe are either mesons (comprising
a quark + anti-quark) or baryons (comprising three quarks). These excitations
must arrange themselves in representations of the unbroken symmetries of the theory.
As we noted, the global symmetries are not exact due to the different quark masses
but, as we describe below, are nonetheless visible in the observed spectrum. The fact
that mesons and baryons arrange themselves into approximate multiplets of $SU(3)_V$
was first noticed by Gell-Mann, who referred to this classification as the eightfold way.


Meson              | Quark Content                                     | Mass (in MeV) | Lifetime (in s)
Pion $\pi^+$       | $u\bar d$                                         | 140           | $10^{-8}$
Pion $\pi^0$       | $\frac{1}{\sqrt2}(u\bar u - d\bar d)$             | 135           | $10^{-16}$
Eta $\eta$         | $\frac{1}{\sqrt6}(u\bar u + d\bar d - 2s\bar s)$  | 548           | $10^{-19}$
Eta Prime $\eta'$  | $\frac{1}{\sqrt3}(u\bar u + d\bar d + s\bar s)$   | 958           | $10^{-21}$
Kaon $K^+$         | $u\bar s$                                         | 494           | $10^{-8}$
Kaon $K^0$         | $d\bar s$                                         | 498           | $10^{-8}$ - $10^{-12}$


5.4.1 Mesons


Many hundreds of mesons are observed in Nature[10]. A simple model of a meson views
it as a bound state of a quark and an anti-quark, or some linear combination of these
states. Each quark is a fermion, so mesons are bosons and, as such, have integer spin.
Here we will describe some of the lightest mesons with spin 0 and 1, containing only
up, down and strange quarks.


Let’s start with the spin 0 mesons. These are all pseudoscalars, with parity $-1$.
A number of these have masses that are lighter than or comparable to the proton (which
weighs in at 938 MeV). These are shown in the table above.


The + and 0 superscripts tell us the electromagnetic charge of the meson. The
charged mesons, $\pi^+$ and $K^+$, both have anti-particles, $\pi^-$ and $K^-$ respectively. The
neutral mesons $\pi^0$, $\eta$ and $\eta'$ are all their own anti-particles; each is described by a
real scalar field. Finally, the neutral $K^0$ is described by a complex scalar field and its
anti-particle is denoted $\bar K^0$. The list therefore contains, in total, nine different particles
+ anti-particles.


All mesons are unstable, decaying via the weak force. We will describe this briefly in 
Section 5.4.3 but, for now, our interest lies in understanding how these mesons arise in 
the first place. In particular, we would like to understand why this particular pattern 
of masses emerges. 


First, an obvious comment: the masses of the mesons are not equal to the sum of 
the masses of their constituent quarks! This gets to the heart of what it means to be 
a strongly coupled quantum field theory. The mesons — and, indeed the baryons — are 
complicated objects, consisting of a bubbling sea of gluons, quarks and anti-quarks. 
This is what gives mesons and baryons mass, and also makes these particles hard to
understand. Thankfully, for a subset of the mesons, we have the chiral Lagrangian to
help us.


[10] All the properties of all the particles in the universe can be found on the Particle Data Group
website http://pdg.lbl.gov/.


Let’s see what we would expect based on chiral symmetry. If we consider QCD
with just two light quarks, the up and the down, then the spontaneous symmetry
breaking of $SU(2)_L \times SU(2)_R$ symmetry should give us three light almost-Goldstone
modes. These are the three pions, $\pi^+$, $\pi^-$ and $\pi^0$.


The fact that the pions are both bound states of fundamental fermions, and yet 
can also be viewed as Goldstone bosons, was first suggested by Yoichiro Nambu in the 
early 1960s. His vision is all the more remarkable given that it came 10 years before 
the formulation of QCD, and several years before Gell-Mann and Zweig introduced the 
idea of quarks. Nambu made many further ground-breaking contributions to theoretical 
physics, including the realisation that quarks carry three colours (not to mention writing 
down one of the key equations of string theory). He had to wait until 2008 for his Nobel 
prize. 


Suppose now that we consider $N_f = 3$ light quarks. We expect $N_f^2 - 1 = 8$ almost-Goldstone
modes. These are usually referred to as pseudo-Goldstone bosons. And, indeed,
there are eight mesons which are substantially lighter than the others: these are the
pions, kaons and the $\eta$. They sit inside our $3\times3$ matrix $\pi$ like this:

$$\pi = \frac{1}{\sqrt2}\begin{pmatrix} \frac{1}{\sqrt2}\pi^0 + \frac{1}{\sqrt6}\eta & \pi^+ & K^+ \\ \pi^- & -\frac{1}{\sqrt2}\pi^0 + \frac{1}{\sqrt6}\eta & K^0 \\ K^- & \bar K^0 & -\sqrt{\frac{2}{3}}\,\eta \end{pmatrix} \qquad (5.22)$$

This is not an obvious arrangement. How do we figure out which particle goes where?
The answer, as with everything in this game, is symmetry. Our theory has an $SU(3)_V$
symmetry, which allows us to assign two Cartan charges $U(1)\times U(1) \subset SU(3)_V$ to
each element of the matrix $\pi$. These charges are called “isospin” and “strangeness”
and coincide with almost-conserved quantities of the particles that can be determined
experimentally.


The eight Goldstone modes that sit in $\pi$ would be exactly massless if the $SU(3)_L \times
SU(3)_R$ symmetry were exact. However, chiral symmetry is broken by the quark mass matrix

$$M = \text{diag}(m_u, m_d, m_s)$$

Since we’re now dealing with a low-energy effective theory, the masses that appear here
should be the renormalised masses, rather than the bare quark masses quoted in the
earlier table. Equation (5.14) then gives us the pion masses. Expanding this out, we
find

$$\mathcal{L}_{\rm mass} = -\frac{2\sigma}{f_\pi^2}\left[\frac{1}{2}(m_u + m_d)\left((\pi^0)^2 + 2\pi^+\pi^-\right) + (m_u + m_s)\,K^-K^+ + (m_d + m_s)\,\bar K^0K^0 + \frac{1}{2}\left(\frac{m_u + m_d}{3} + \frac{4m_s}{3}\right)\eta^2 + \frac{1}{\sqrt3}(m_u - m_d)\,\pi^0\eta\right] \qquad (5.23)$$

Note that there is mixing between $\pi^0$ and $\eta$, albeit one that disappears when $m_u = m_d$
so that isospin is restored. There is lots of interesting information in this equation.
First note that we cannot directly relate the quark masses to the meson masses; they
depend on the unknown ratio $\sigma/f_\pi^2$. Nonetheless, there are a number of simple relations
between meson masses, quark masses and the chiral condensate that we can extract.
For example, the mass of $\pi^0$ is given by

$$m_\pi^2 = \frac{2\sigma}{f_\pi^2}\,(m_u + m_d)$$

We learn that the square of the pion mass scales linearly with the quark masses. This
is known as the Gell-Mann-Oakes-Renner relation.


By taking ratios, we can relate meson and quark masses directly. For example, we
have

$$\frac{m_{K^+}^2 - m_{K^0}^2}{m_\pi^2} = \frac{m_u - m_d}{m_u + m_d} \qquad (5.24)$$

Finally, we can also derive expected relationships between the meson masses. For
example, we have $3m_\eta^2 + m_\pi^2 = \frac{2\sigma}{f_\pi^2}\left(2(m_u + m_d) + 4m_s\right)$. If we accept that $m_u \approx m_d$, then
we get the relation

$$m_K^2 \approx \frac{3}{4}\,m_\eta^2 + \frac{1}{4}\,m_\pi^2$$

This is known as the Gell-Mann-Okubo relation. Comparing against the experimentally
measured masses, we have $\frac{1}{2}\sqrt{3m_\eta^2 + m_\pi^2} \approx 480$ MeV, which is not far off the measured
value of $m_K \approx 495$ MeV.
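
Both halves of this argument can be checked in a few lines. The sketch below assumes the leading-order squared masses read off from (5.23) with $\pi^0$-$\eta$ mixing ignored; the overall constant `c` stands for the common prefactor in (5.23) and drops out of every ratio. The quark masses in the structural check are arbitrary test inputs, while the meson masses are those in the meson table.

```python
import math

def meson_masses_squared(m_u, m_d, m_s, c=1.0):
    """Leading-order squared masses from (5.23), ignoring pi0-eta mixing.
    The constant c stands for the overall prefactor in (5.23)."""
    return {
        "pi": c * (m_u + m_d),
        "K+": c * (m_u + m_s),
        "K0": c * (m_d + m_s),
        "eta": c * ((m_u + m_d) / 3.0 + 4.0 * m_s / 3.0),
    }

# Structural check with arbitrary test quark masses.
m2 = meson_masses_squared(m_u=2.0, m_d=4.0, m_s=95.0)
lhs = 3.0 * m2["eta"] + m2["pi"]
rhs = 2.0 * (m2["K+"] + m2["K0"])

# Numerical check with the measured masses (MeV) from the meson table.
m_eta, m_pi = 548.0, 140.0
m_K_predicted = 0.5 * math.sqrt(3.0 * m_eta**2 + m_pi**2)

print(lhs - rhs, m_K_predicted)
```

The first number vanishes identically, which is the statement that $3m_\eta^2 + m_\pi^2$ equals twice the sum of the two squared kaon masses. The second is the predicted kaon mass in MeV, a few percent below the measured value.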


The $\eta'$ Meson


There is one meson listed in the table that is not a Goldstone boson. This is the $\eta'$ which,
despite having similar quark content to the $\eta$, has almost twice the mass. Note that,
in contrast to the other eight mesons, $\eta' = \frac{1}{\sqrt3}(u\bar u + d\bar d + s\bar s)$ is a singlet under $SU(3)_V$.
This is actually the would-be Goldstone boson associated to the $U(1)_A$ axial symmetry.
However, as we have seen, this symmetry suffers from an anomaly, which means that
the $\eta'$ meson is not massless in the chiral limit, and is not particularly light in the real
world.


The Mysterious Sigma


There is one light scalar meson listed in the particle data book that I have not yet
mentioned. It goes by the catchy name of $f_0(500)$ and has a mass which is listed as
somewhere between 400 and 550 MeV. The reason that it’s so difficult to pin down is that
it decays very quickly, via the strong force rather than the weak force, to two pions.
Moreover, it has vanishing quantum numbers (angular momentum, parity, isospin and
strangeness).


Experimentally, it’s probably best not to refer to this resonance as a particle at all.
However, theoretically it has played a very important role, for this is the “sigma” after
which the sigma-model is named. It can be thought of as the excitation that arises
from ripples in the value of the quark condensate, $\sigma \sim \bar\psi\psi$, rather than rotations in the
quark condensate $U$.


5.4.2 Baryons 


We will briefly describe the baryon spectrum in QCD. In the non-relativistic quark
model, with $G = SU(3)$ gauge group, each baryon contains three quarks. As with
the mesons, this is a caricature of a baryon which, in reality, is a complicated object
containing many hundreds of gluons, quarks and anti-quarks, but with three more quarks
than anti-quarks. This caricature sometimes goes by the name of the non-relativistic
quark model.


If we work with $N_f = 3$ species of light quarks, each transforms in the $\mathbf{3}$ of $SU(3)_V$.
We have

$$\mathbf{3}\otimes\mathbf{3}\otimes\mathbf{3} = \mathbf{1}\oplus\mathbf{8}\oplus\mathbf{8}\oplus\mathbf{10}$$

A little bit of group theory, combined with the Pauli exclusion principle, shows that
those baryons which have spin 1/2 must lie in the $\mathbf{8}$ of $SU(3)_V$. Indeed, there is an
octuplet of baryons whose masses differ from each other by about 30%. These are shown
in the table below.
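
The dimensions in this decomposition can be checked by counting. A rank-three tensor of $SU(n)$ splits into a totally antisymmetric part, a totally symmetric part and two copies of a mixed-symmetry part; the sketch below uses the standard dimension formulas for each (the function names are illustrative) and confirms that they add up to $n^3$.

```python
def dim_symmetric(n):
    """Dimension of the totally symmetric rank-3 tensor of SU(n)."""
    return n * (n + 1) * (n + 2) // 6

def dim_antisymmetric(n):
    """Dimension of the totally antisymmetric rank-3 tensor of SU(n)."""
    return n * (n - 1) * (n - 2) // 6

def dim_mixed(n):
    """Dimension of each mixed-symmetry rank-3 representation of SU(n)."""
    return n * (n * n - 1) // 3

n = 3
dims = (dim_antisymmetric(n), dim_mixed(n), dim_mixed(n), dim_symmetric(n))
print(dims, sum(dims), n ** 3)
```

For $n = 3$ the singlet is the antisymmetric combination, the decuplet is the symmetric one, and the two octets carry the mixed symmetry.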


Similarly, one can show that baryons with spin 3/2 lie in the $\mathbf{10}$ of $SU(3)_V$. Such a
decuplet of baryons also exists: they go by the names $\Delta$ (with charges 0, $\pm1$ and 2),
$\Sigma^*$ (with charges 0 and $\pm1$), $\Xi^*$ (with charges $-1$ and 0) and $\Omega^-$ with charge $-1$.


The fact that the baryons sit nicely into representations of $SU(3)_V$ was first noticed
by Gell-Mann who dubbed it the eightfold way. At the time the $\Omega^-$ baryon, which
has quark content $sss$, had not been discovered. Gell-Mann (and, independently,
Ne’eman) used the representation properties to predict the mass, charge and decay
products of this particle.


Baryon             | Quark Content | Mass (in MeV) | Lifetime (in s)
Proton $p$         | $uud$         | 938           | stable
Neutron $n$        | $udd$         | 940           | $10^{3}$
Lambda $\Lambda^0$ | $uds$         | 1115          | $10^{-10}$
Sigma $\Sigma^+$   | $uus$         | 1189          | $10^{-10}$
Sigma $\Sigma^0$   | $uds$         | 1193          | $10^{-20}$
Sigma $\Sigma^-$   | $dds$         | 1197          | $10^{-10}$
Xi $\Xi^0$         | $uss$         | 1315          | $10^{-10}$
Xi $\Xi^-$         | $dss$         | 1321          | $10^{-10}$


For the pions, we showed how the mass splitting can be explained from the chiral 
Lagrangian. We will not do this for baryons, although with some work one can show 
that the Skyrmion spectrum indeed gives reasonable agreement. 


5.4.3 Electromagnetism, the Weak Force, and Pion Decay 


It’s not just the quark masses that explicitly break the $SU(3)_V$ flavour symmetry of
the Standard Model; the symmetry is also broken by the coupling to the other forces.


At low energies, the relevant force is electromagnetism. The $U(1)_{EM}$ of electromagnetism
is a subgroup of $SU(3)_V$, generated by

$$Q = \begin{pmatrix} \frac{2}{3} & 0 & 0 \\ 0 & -\frac{1}{3} & 0 \\ 0 & 0 & -\frac{1}{3} \end{pmatrix} \qquad (5.25)$$

This is enough to tell us how to couple photons to the chiral Lagrangian. We simply
need to replace the derivatives in (5.7) with covariant derivatives,

$$S = \int d^4x\ \frac{f_\pi^2}{4}\,\text{tr}\left(D^\mu U^\dagger D_\mu U\right) \qquad (5.26)$$

where

$$D_\mu U = \partial_\mu U - ieA_\mu[Q, U]$$

with $e$ the electric charge of an electron.


At the classical level, this coupling preserves a $(U(1)\times SU(2))_L \times (U(1)\times SU(2))_R$
subgroup of the $SU(3)_L \times SU(3)_R$ chiral symmetry. This means that, if all quark masses
vanish, the four neutral mesons $\pi^0$, $\eta$, $K^0$ and $\bar K^0$ would still be Goldstone bosons, and
massless even when we include the effects of electromagnetism. In contrast, the charged
mesons $\pi^\pm$ and $K^\pm$ are massless only at tree level. One-loop effects give a contribution
to their mass of the form $\delta M_{EM} \sim e^2\,\text{tr}(QUQU^\dagger)$. The charged meson masses in (5.23)
then become

$$m_{\pi^+}^2 = \frac{2\sigma}{f_\pi^2}\,(m_u + m_d) + \delta M_{EM} \quad\text{and}\quad m_{K^+}^2 = \frac{2\sigma}{f_\pi^2}\,(m_u + m_s) + \delta M_{EM}$$

By taking ratios of these meson masses, we can cancel the factors of $\sigma/f_\pi^2$ and $\delta M_{EM}$
and learn about the quark masses. For example, taking into account electromagnetic
corrections, we can generalise (5.24) to

$$\frac{(m_{K^+}^2 - m_{K^0}^2) - (m_{\pi^+}^2 - m_{\pi^0}^2)}{m_{\pi^0}^2} = \frac{m_u - m_d}{m_u + m_d}$$

From the measured masses of the mesons, we then get that $m_d/m_u \approx 2$.
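
The estimate of the quark mass ratio follows from solving the electromagnetically corrected relation, $\frac{(m_{K^+}^2 - m_{K^0}^2) - (m_{\pi^+}^2 - m_{\pi^0}^2)}{m_{\pi^0}^2} = \frac{m_u - m_d}{m_u + m_d}$, for $m_d/m_u$. A minimal sketch, using the rounded meson masses from the meson table (the function name is illustrative):

```python
def quark_mass_ratio(m_K_plus, m_K_zero, m_pi_plus, m_pi_zero):
    """Solve the electromagnetically corrected mass relation for m_d / m_u."""
    x = ((m_K_plus**2 - m_K_zero**2)
         - (m_pi_plus**2 - m_pi_zero**2)) / m_pi_zero**2
    # x = (m_u - m_d) / (m_u + m_d)  =>  m_d / m_u = (1 - x) / (1 + x)
    return (1.0 - x) / (1.0 + x)

ratio = quark_mass_ratio(m_K_plus=494.0, m_K_zero=498.0,
                         m_pi_plus=140.0, m_pi_zero=135.0)
print(ratio)
```

With these rounded inputs the ratio comes out a little below 2. The result is sensitive to the small mass differences, so more precise inputs shift it somewhat.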


Charged Pion Decay 


Although certain pions are relatively long lived — most notably the 7* and the kaons 
— none are absolutely stable. They decay through the weak force. Happily, this too 
is rather straightforward to calculate using the chiral Lagrangian, because the weak 
gauge group coincides with SU (2)z isospin. 


For example, the charged pion t+ = ud has a lifetime of ~ 1078 seconds, decaying 
almost always to 


nt + Vy 
The decay is mediated by the W-boson. If we integrate out the W-boson, we can 
equally well describe the decay using Fermi’s four-fermion interaction, 


Leoni = E [a0 = Pya] [aul — 1) 
where Gp ~ 1075 GeV~2 is the Fermi constant. The computation of the decay rate 
now factorises into two pieces: the leptonic part (fv,,|7,(1—°)v,,|0) can be computed 
perturbatively. However, the piece involving the quarks involve strongly interacting 
physics, (O|juy“(1 — 7°)d|x*). Thankfully we can compute this using the currents that 
we introduced in Section 5.2.2. The operator coincides with the SU (2), current (5.10), 
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We can then use our result (5.13), 


Sn 


(O12 ,(0)|n"(p)) = i225" ppe? 


We simply need to remember that nt = %(7' + in?) to find that the matrix element 


Sis 


is determined by fyr, 
(Olay"(1 — 7°)d|at) = 1/2 f, phe" ?* 


Recall that when we first introduced f, in Section 5.2, we mentioned that it is called 
the pion decay constant, even though that name made little sense in the theory we were 
considering. Now we see why: it is the scale which directly determines the decay width 
of the pion. 


To compute the lifetime of the pion, we must square the matrix element and integrate
over the phase space of $\mu^+$ and $\nu_\mu$. The end result for the rate of decay is then given by

$$\Gamma(\pi^+ \to \mu^+ + \nu_\mu) = \frac{G_F^2 f_\pi^2}{4\pi}\,m_\pi m_\mu^2\left(1 - \frac{m_\mu^2}{m_\pi^2}\right)^2$$
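To see that this formula reproduces a lifetime of order $10^{-8}$ seconds, here is a minimal numerical sketch. It assumes the normalisation in which $f_\pi \approx 92$ MeV, so that the prefactor is $G_F^2 f_\pi^2/4\pi$, and it ignores the CKM factor and radiative corrections. The input values and function names are our own illustrative choices.

```python
import math

G_F = 1.1664e-5    # Fermi constant in GeV^-2
F_PI = 0.0924      # pion decay constant in GeV (assumed normalisation, f_pi ~ 92 MeV)
M_PI = 0.13957     # charged pion mass in GeV
M_MU = 0.10566     # muon mass in GeV
M_E = 0.000511     # electron mass in GeV
HBAR = 6.582e-25   # hbar in GeV * seconds


def leptonic_width(m_lepton):
    """Tree-level width (in GeV) for pi+ decaying to a charged lepton and a neutrino."""
    phase_space = (1.0 - m_lepton**2 / M_PI**2) ** 2
    return G_F**2 * F_PI**2 * M_PI * m_lepton**2 * phase_space / (4.0 * math.pi)


muon_width = leptonic_width(M_MU)
lifetime = HBAR / muon_width                       # in seconds
electron_fraction = leptonic_width(M_E) / muon_width

print(f"lifetime ~ {lifetime:.2e} s")
print(f"electron/muon width ratio ~ {electron_fraction:.2e}")
```

The factor of $m_\mu^2$ in the rate is also why the pion decays "almost always" to muons: replacing the muon with the much lighter electron suppresses the width by roughly four orders of magnitude.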


Neutral Pion Decay 


The neutral pion, $\pi^0 = \frac{1}{\sqrt{2}}(\bar{u}u - \bar{d}d)$, has a substantially shorter lifespan than its
charged cousin. It lasts only around $\sim 10^{-16}$ seconds, decaying primarily to

$$\pi^0 \to \gamma\gamma$$


There is an interesting story associated to this. Indeed, it was the effort to understand 
why this decay occurs at all that first led to the discovery of the anomaly. 


The full history is, as with many things in this subject, rather convoluted. The 
pion decay was first computed in the 1940s, by assuming a coupling to the nucleons 
$N = (p,n)$ of the form $G_{\pi N}\,\pi^a\bar{N}\gamma^5\sigma^a N$. This gives a result which is pretty close to
the observed value. Unfortunately, this calculation is wrong. As we’ve seen, the pion
is really a Goldstone boson and so has only derivative couplings, at least in the limit
$m_\pi \to 0$. Indeed, one can show that in a theory with an unbroken $SU(2)_L \times SU(2)_R$
chiral symmetry, the decay $\pi^0 \to \gamma\gamma$ would be forbidden. What’s going on?


The answer is that we’ve missed something. Gauging a subgroup $U(1)_{\rm EM} \subset SU(2)_V$
introduces an anomaly for the axial currents. We can import our calculation of the
chiral anomaly from Section 3.1. For two quarks, up and down, each with $N_c = 3$
colours, we have

$$\partial_\mu J^{a\,\mu}_A = \frac{N_c e^2}{16\pi^2}\,\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}\ {\rm tr}\,(T^aQ^2)$$


where here $F_{\mu\nu}$ denotes the electromagnetic field strength. In contrast to (5.25), we
now take $U(1)_{\rm EM} \subset SU(2)_V$ to be generated by

$$Q = \begin{pmatrix} \frac{2}{3} & 0 \\ 0 & -\frac{1}{3} \end{pmatrix}$$


Only the $a = 3$ component of the current is non-vanishing, with

$$\partial_\mu J^{3\,\mu}_A = \frac{N_c e^2}{96\pi^2}\,\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}$$


But this is precisely the current which, from (5.13), creates the neutral pion $\pi^0$, with
$\langle 0|J^3_{A\mu}|\pi^0\rangle = -if_\pi p_\mu e^{-ip\cdot x}$. The anomaly equation then gives an amplitude for
$\pi^0 \to \gamma\gamma$. This amplitude is the same as that which would arise from the coupling in the
Lagrangian

$$\mathcal{L} = \frac{N_c e^2}{96\pi^2 f_\pi}\,\pi^0\,\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma} \qquad (5.27)$$


Note that the decay amplitude is proportional to $N_c$, the number of colours. Comparing
to the experimental data provides a way to determine $N_c = 3$. (Actually, this is a little
bit quick because the $U(1)$ charge assignments above are fixed, in part, by anomaly
cancellation which, as we saw in Section 3.4.4, changes if we change $N_c$.) Above we
have used just two quarks, $N_f = 2$, but we get the same results using $N_f = 3$ if we
correctly identify the current producing $\pi^0$ from within the matrix (5.22).
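To make the sensitivity to the number of colours concrete, the sketch below uses the standard width that follows from a $\pi^0 F\tilde{F}$ coupling with coefficient $N_c e^2/96\pi^2 f_\pi$, namely $\Gamma = N_c^2\alpha^2 m_\pi^3/(576\pi^3 f_\pi^2)$. This closed form is not derived in the text and is quoted here as an assumption, as are the numerical inputs ($f_\pi \approx 92$ MeV normalisation).

```python
import math

ALPHA = 1.0 / 137.036   # fine structure constant
M_PI0 = 134.977         # neutral pion mass in MeV
F_PI = 92.4             # pion decay constant in MeV (assumed normalisation)
HBAR = 6.582e-22        # hbar in MeV * seconds


def neutral_pion_width(n_colours):
    """Width (in MeV) of pi0 -> two photons for a given number of colours."""
    return n_colours**2 * ALPHA**2 * M_PI0**3 / (576.0 * math.pi**3 * F_PI**2)


for n_colours in (1, 2, 3):
    width = neutral_pion_width(n_colours)
    print(f"N_c = {n_colours}: width ~ {width * 1e6:.2f} eV, lifetime ~ {HBAR / width:.1e} s")
```

Because the rate scales as $N_c^2$, the choices $N_c = 1$ or $2$ miss the observed lifetime of around $10^{-16}$ seconds by a large factor, while $N_c = 3$ lands close to it.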


We have argued that the anomaly means there must be an effective coupling of the
form (5.27). Yet there’s something odd in this, because if we expand out the action
(5.26), no such term arises. Indeed, naively this term appears to contradict the ethos of
this whole section, because the Goldstone boson $\pi^0$ isn’t obviously derivatively coupled,
which seems very unGoldstonelike. Nonetheless, it would be nice to be able to write
down a low-energy effective action that correctly captures the anomaly, rather than
adding it in by hand. It turns out that there is a beautiful way to achieve this.
5.5 The Wess-Zumino-Witten Term


We have argued that, at low-energies, the dynamics of the Goldstone modes is captured 
by the chiral Lagrangian 


$$S = \frac{f_\pi^2}{4}\int d^4x\ {\rm tr}\,(\partial^\mu U^\dagger\,\partial_\mu U) \qquad (5.28)$$


We also briefly discussed in Section 5.2.1 the higher order terms that we could add to 
this action to improve its accuracy as we go to higher energies. It turns out, however, 
that this misses one very important term, one which, among other things, accounts for 
the anomaly. This is known as the Wess-Zumino-Witten term. 


To motivate the need for an extra term, let’s look more closely at the discrete
symmetries of the chiral Lagrangian (5.28). They are:

• Charge conjugation, $C : U \mapsto U^*$.

• “Naive parity”, $P_0 : \mathbf{x} \mapsto -\mathbf{x}$ with $t \mapsto t$ and $U \mapsto U$.

• An extra symmetry: $U \mapsto U^\dagger$. In terms of the pion fields (5.6)

$$U = \exp\left(\frac{2i}{f_\pi}\pi\right) = 1 + \frac{2i}{f_\pi}\pi + \ldots \qquad (5.29)$$

this symmetry acts as $\pi^a \mapsto -\pi^a$. In other words, it counts pions mod 2. For this
reason, we denote the symmetry as $(-1)^{N_\pi}$ where $N_\pi$ is the number of pions.


However, these are not all symmetries of the underlying QCD-like gauge theory. Indeed, 
the pions and other Goldstone bosons in QCD are pseudoscalars, meaning that they 
are odd under parity. The correct parity transformation should be 


$$P = P_0\,(-1)^{N_\pi}$$


It is unusual — although not unheard of — to have a low-energy theory which enjoys 
more symmetries than its high-energy parent. It might lead us to suspect that we’ve 
missed something. Are we really sure that there are no terms that we can add to (5.28) 
which violate both $P_0$ and $(-1)^{N_\pi}$, leaving only $P$ as a symmetry?


It is simple to look through the higher derivative terms (5.9) that we met before and
convince yourself that they all preserve both $P_0$ and $(-1)^{N_\pi}$. Indeed, the way to get
something that violates $P_0$ is to use the anti-symmetric tensor $\epsilon^{\mu\nu\rho\sigma}$. But if we try to
form a four-derivative term in the action from this, we would have

$$\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\left(U^\dagger(\partial_\mu U)\,U^\dagger(\partial_\nu U)\,U^\dagger(\partial_\rho U)\,U^\dagger(\partial_\sigma U)\right) = 0 \qquad (5.30)$$

and, as shown, this vanishes by anti-symmetry. You can also consider higher derivative
terms and see that they too preserve all these discrete symmetries. There’s no way to
construct terms in the action that violate $P_0$.

Figure 45: Integrating over S... Figure 46: ...or over S’.
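The vanishing of the four-derivative term can be checked numerically. The trace is cyclic, while a cyclic shift of four indices is an odd permutation, so the anti-symmetrised trace of any four matrices is zero. The same argument does not kill the anti-symmetrised trace of five matrices, which is what the five-dimensional construction below exploits. The sketch uses random $3\times 3$ complex matrices as stand-ins for $U^\dagger\partial_\mu U$. The helper names are our own.

```python
import itertools
import random


def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def permutation_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def antisymmetrised_trace(mats):
    """Sum over permutations of sign(perm) * tr(M_perm[0] M_perm[1] ...)."""
    total = 0
    for perm in itertools.permutations(range(len(mats))):
        product = mats[perm[0]]
        for index in perm[1:]:
            product = matmul(product, mats[index])
        total += permutation_sign(perm) * trace(product)
    return total


def random_matrix(n, rng):
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)] for _ in range(n)]


rng = random.Random(0)
four_index = antisymmetrised_trace([random_matrix(3, rng) for _ in range(4)])
five_index = antisymmetrised_trace([random_matrix(3, rng) for _ in range(5)])
print(abs(four_index), abs(five_index))
```

The first number is zero up to rounding error, while the second is generically non-zero.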


However, the story is rather different if we work with the equation of motion. The 
equation of motion arising from (5.28) is 


$$\frac{1}{2}f_\pi^2\,\partial_\mu(U^\dagger\partial^\mu U) = 0$$

We could add to this the term

$$\frac{1}{2}f_\pi^2\,\partial_\mu(U^\dagger\partial^\mu U) = \frac{k}{48\pi^2}\,\epsilon^{\mu\nu\rho\sigma}\,U^\dagger(\partial_\mu U)\,U^\dagger(\partial_\nu U)\,U^\dagger(\partial_\rho U)\,U^\dagger(\partial_\sigma U) \qquad (5.31)$$

where $k$ is some constant which we will fix shortly and the normalisation of $48\pi^2$ is
for later convenience. This is the famous Wess-Zumino-Witten term, first introduced
in this context by Witten. Despite our feeble attempts above, it turns out that there
is a way to write an action for this term, but not if we restrict ourselves to actions in
four dimensions!


5.5.1 An Analogy: A Magnetic Monopole 


A useful analogy can be found in Dirac monopoles. This is a story that we’ve already 
met in Section 1.1. Consider a particle of mass $m$ and unit charge moving in $\mathbb{R}^3$ in the
background of a Dirac monopole. The equation of motion is

$$m\ddot{x}^i = \lambda\,\epsilon^{ijk}\,\dot{x}^j\,\frac{x^k}{|\mathbf{x}|^3}$$

with $\lambda$ a constant which determines the strength of the monopole. This system shares
some similarities with our discussion above. First, the left-hand side is invariant under
two discrete symmetries: time reversal $t \mapsto -t$ and parity $x^i \mapsto -x^i$. However, the
term on the right-hand side is not invariant under either of these on its own, but only
if we do both at once. Furthermore, the equation of motion is invariant under $SO(3)$
rotations.
Can we construct an action for this equation of motion? If we try to do so preserving
the $SO(3)$ rotational invariance, we run into trouble because the obvious term that we
might try to write down to reproduce the right-hand side is $\epsilon^{ijk}x^ix^j\dot{x}^k = 0$ by
anti-symmetry. This, of course, is analogous to (5.30). However, this doesn’t mean that no
action exists. In fact, there are two possibilities. One is to introduce a gauge potential
$A_i(\mathbf{x})$ and write down the action

$$S = \int_C dt\ \left(\frac{1}{2}m\dot{\mathbf{x}}^2 + \lambda A_i(\mathbf{x})\,\dot{x}^i\right)$$


where C is the worldline of the particle. An example of such a gauge potential was 
given in (1.5). This approach has two problems: the gauge potential necessarily breaks 
the SO(3) symmetry, which is no longer manifest in the action; and the gauge potential 
necessarily suffers from a Dirac string singularity. 


We can circumvent both of these problems simply by using Stokes’ theorem. Suppose 
that we take C to be a closed path. We then write 


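$$\int_C dt\ A_i(\mathbf{x})\,\dot{x}^i = \int_S dS^{ij}\,F_{ij}(\mathbf{x}) \qquad (5.32)$$

where $S$ is a two-dimensional disc, with boundary $\partial S = C$, as shown in the figure.
Now things are much nicer. The field strength $F_{ij} = \epsilon_{ijk}x^k/|\mathbf{x}|^3$ is both $SO(3)$ invariant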
and, away from the origin, non-singular. However, the price that we paid is that the 
action is written in terms of a two-dimensional surface, rather than the one-dimensional 
worldline. 


There is one further problem with the action (5.32) because, as we saw in Section 1.1, 
there is an ambiguity in the choice of surface S. There is another surface S’, with the 
opposite orientation, that also does the job. For the path integral to be well-defined, 
we require that these two options give the same answer. We must have 


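$$\exp\left(i\lambda\int_C dt\ A_i\dot{x}^i\right) = \exp\left(i\lambda\int_S dS^{ij}F_{ij}\right) = \exp\left(-i\lambda\int_{S'} dS^{ij}F_{ij}\right)$$

Stitching together the two discs gives the closed two-sphere $S^2$. The condition can then
be written as the requirement

$$\exp\left(i\lambda\int_{S\cup S'} dS^{ij}F_{ij}\right) = \exp\left(i\lambda\int_{S^2} dS^{ij}F_{ij}\right) = 1 \qquad (5.33)$$

However, the magnetic flux through any closed surface is quantised, with the minimum
flux given by $\int_{S^2} dS^{ij}F_{ij} = 4\pi$. We see that the path integral is consistent only if

$$\lambda \in \frac{1}{2}\mathbb{Z}$$
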


This is simply a restatement of the Dirac quantisation condition that we already met 
in Section 1.1. 
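The consistency condition is simple enough to test directly. The sketch below assumes the minimum flux of $4\pi$ quoted above and checks which values of $\lambda$ make the phase $\exp(i\lambda\cdot 4\pi)$ equal to one. The function name and tolerance are our own choices.

```python
import cmath
import math

MINIMUM_FLUX = 4.0 * math.pi   # flux of the monopole through a closed two-sphere


def path_integral_is_consistent(strength, tolerance=1e-9):
    """True if exp(i * strength * flux) = 1, so that S and S' give the same weight."""
    return abs(cmath.exp(1j * strength * MINIMUM_FLUX) - 1.0) < tolerance


half_integers = [n / 2 for n in range(-4, 5)]
print([path_integral_is_consistent(value) for value in half_integers])
print(path_integral_is_consistent(0.25), path_integral_is_consistent(1.0 / 3.0))
```

Every half-integer passes, while generic values such as $\lambda = 1/4$ or $1/3$ fail, since the two choices of surface would then weight the same path differently.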


5.5.2 A Five-Dimensional Action 


With the discussion of the magnetic monopole fresh in our minds, let’s now return to 
the chiral Lagrangian. We would like to ask if there is some action which respects the 
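$SU(N_f)_L \times SU(N_f)_R$ symmetry of the chiral Lagrangian and reproduces the term on
the right-hand side of (5.31). The answer is yes, but it can only be written by invoking
a fifth dimension.

We will work in the Euclidean path integral and the argument is simplest if we take
our spacetime to be $S^4$. We introduce a five-dimensional ball, $D$, such that $\partial D = S^4$.
We extend the fields $U(x)$ over $S^4$ to $U(y)$, where $y$ are coordinates on the ball $D$. We
can then reproduce the equation of motion (5.31) from the action

$$S = \frac{f_\pi^2}{4}\int d^4x\ {\rm tr}\,(\partial^\mu U^\dagger\,\partial_\mu U) + k\int_D d^5y\ \omega \qquad (5.34)$$

where

$$\omega = -\frac{i}{240\pi^2}\,\epsilon^{\mu\nu\rho\sigma\tau}\,{\rm tr}\left(U^\dagger\frac{\partial U}{\partial y^\mu}\,U^\dagger\frac{\partial U}{\partial y^\nu}\,U^\dagger\frac{\partial U}{\partial y^\rho}\,U^\dagger\frac{\partial U}{\partial y^\sigma}\,U^\dagger\frac{\partial U}{\partial y^\tau}\right) \qquad (5.35)$$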


This is the Wess-Zumino-Witten (WZW) term. There are a few things to say about
this. First, it is manifestly invariant under the $SU(N_f)_L \times SU(N_f)_R$ chiral symmetry.
Second, it naively appears to depend on the choice of extension of $U(x)$ to the
five-dimensional space $U(y)$, but this is an illusion. The equations of motion computed
from the term $\Gamma = \int_D d^5y\ \omega$ depend only on $U(x)$ restricted to the boundary $S^4$. There
are a couple of ways to see this. A somewhat involved calculation shows that the variation
of $\Gamma$ is indeed a boundary term. Alternatively, we can expand $U$ in the pion fields as
in (5.29),

$$U^\dagger\partial_\mu U = \frac{2i}{f_\pi}\,\partial_\mu\pi + O(\pi^2)$$

Then

$$\int_D d^5y\ \omega = \frac{2}{15\pi^2 f_\pi^5}\int_D d^5y\ \epsilon^{\mu\nu\rho\sigma\tau}\,{\rm tr}\,(\partial_\mu\pi\,\partial_\nu\pi\,\partial_\rho\pi\,\partial_\sigma\pi\,\partial_\tau\pi) + O(\pi^6)$$

$$\qquad\qquad = \frac{2}{15\pi^2 f_\pi^5}\int_{S^4} d^4x\ \epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\,(\pi\,\partial_\mu\pi\,\partial_\nu\pi\,\partial_\rho\pi\,\partial_\sigma\pi) + O(\pi^6)$$

Written in this form, the $SU(N_f)_L \times SU(N_f)_R$ symmetry is no longer manifest. This
is entirely analogous to the lack of manifest rotation symmetry in the Dirac monopole
connection. Nonetheless, since it came from the term (5.34), the symmetry must be
there, albeit hidden.
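The numerical coefficient in the pion expansion is a short piece of arithmetic. The sketch below assumes that $\omega$ carries the prefactor $-i/240\pi^2$ and that each of the five factors of $U^\dagger\partial U$ contributes $2i/f_\pi$ at leading order, and tracks the pure number multiplying $1/(\pi^2 f_\pi^5)$.

```python
from fractions import Fraction

# (-i) from the prefactor of omega, times (2i)^5 from five factors of U^dagger dU.
phase_and_powers = -1j * (2j) ** 5
assert abs(phase_and_powers.imag) < 1e-12   # the combination is real

# Divide by the 240 in the normalisation of omega and reduce the fraction.
coefficient = Fraction(round(phase_and_powers.real), 240)
print(coefficient)
```

The factors of $i$ combine to give a real, positive coefficient of $2/15$, so the five-pion vertex appears with a real coupling in the action.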
We see that the new term gives a five-point interaction between Goldstone modes.
In the context of QCD, this mediates the process $K^+ + K^- \to \pi^+ + \pi^- + \pi^0$, which
explicitly breaks the $(-1)^{N_\pi}$ symmetry of the original chiral Lagrangian.


Quantisation of the Coefficient 


Just as for the Dirac monopole, there is an ambiguity in our choice of five-dimensional 
ball $D$ with $\partial D = S^4$. We could just as well take a ball $D'$, also with $\partial D' = S^4$ but
with the opposite orientation. We can now make the same kind of arguments that, in
(5.33), gave us Dirac quantisation. We have

$$\exp\left(ik\int_D d^5y\ \omega\right) = \exp\left(-ik\int_{D'} d^5y\ \omega\right)$$

Stitching together the two five-dimensional balls now makes a five-sphere: $D \cup D' = S^5$.
For our path integral to make sense, we must have

$$\exp\left(ik\int_{S^5} d^5y\ \omega\right) = 1 \qquad (5.36)$$

By now it’s probably no surprise to learn that there’s some pretty topology that
underlies this formula! The integrand provides a map from $S^5$ to the group manifold
$SU(N)$, parameterised by $U(y)$. Such maps are characterised by the fifth homotopy
group,

$$\Pi_5(SU(N)) = \mathbb{Z} \quad\text{for}\quad N \geq 3$$

This means that as long as we have $N_f \geq 3$ flavours, each map can be assigned a
winding $n \in \mathbb{Z}$. It turns out that this winding is computed by

$$\int_{S^5} d^5y\ \omega = 2\pi n$$

The quantisation condition (5.36) is then satisfied providing

$$k \in \mathbb{Z}$$

This leads us to our next question. What is $k$?


Rediscovering the Anomaly 


The Wess-Zumino-Witten term is closely related to the chiral anomaly. This, it turns 
out, will give us a strategy to determine the integer k. 


Here is the plan. We will gauge a $U(1)$ subgroup of $SU(N_f)_{\rm diag} \subset SU(N_f)_L \times SU(N_f)_R$.
To do this, we introduce a charge matrix $Q$, as in (5.25), and promote the
derivatives in the chiral Lagrangian to covariant derivatives

$$S = \frac{f_\pi^2}{4}\int d^4x\ {\rm tr}\,(D^\mu U^\dagger\,D_\mu U) + S_{\rm WZW}$$

with $D_\mu U = \partial_\mu U - ieA_\mu[Q,U]$. However, we also need to find a way to make the
Wess-Zumino-Witten term gauge invariant. It’s tempting to just do the same trick,
and promote $\partial_\mu U$ to $D_\mu U$ in (5.35). But this isn’t allowed because the resulting action
now depends on what’s going on in five dimensions. Any gauging must take place only
in four dimensions.


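To proceed, we first look at how the WZW term changes under an infinitesimal
transformation $\delta U = i\alpha(x)[Q,U]$ where $\alpha(x)$ depends only on the four-dimensional
coordinates. We have, schematically,

$$\delta(U^\dagger\partial U) = i\alpha\,[Q, U^\dagger\partial U] + i\,\partial\alpha\ U^\dagger[Q,U]$$

The variation of the 5-form $\omega$ defined in (5.35) has terms of order $(\partial\alpha)^n$, with
$n = 0,1,\ldots,5$. Of these the $n = 0$ term vanishes by cyclicity of the trace, while the
$n = 2,3,4,5$ terms vanish by the anti-symmetry of the $\epsilon^{\mu\nu\rho\sigma\tau}$ symbol. After judicious
use of the identity $U^\dagger\partial U = -(\partial U^\dagger)U$, we find

$$\delta\omega = (\partial_\mu\alpha)\,J^\mu$$

where the current $J^\mu$ is given by

$$J^\mu = \frac{1}{48\pi^2}\,\epsilon^{\mu\nu\rho\sigma\tau}\,{\rm tr}\left(\{Q,\partial_\nu U^\dagger\}\,\partial_\rho U\,U^\dagger\partial_\sigma U\,U^\dagger\partial_\tau U\right)$$

$$\quad\ = \frac{1}{48\pi^2}\,\epsilon^{\mu\nu\rho\sigma\tau}\,\partial_\nu\,{\rm tr}\left(\{Q,U^\dagger\}\,\partial_\rho U\,U^\dagger\partial_\sigma U\,U^\dagger\partial_\tau U\right)$$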


where you need to work a little bit to check that the extra terms that you get from
acting with $\partial_\nu$ vanish by anti-symmetry. Because the current is a total derivative
(and because $\partial\alpha$ depends only on the four-dimensional coordinates), the variation of
$\int_D d^5y\ \omega$ reduces to a boundary term and, at leading order, can be cancelled by the
variation of the four-dimensional gauge field $\delta A_\mu = \partial_\mu\alpha/e$. This means that we can
introduce the gauged WZW term

$$S_{\rm WZW} = k\left(\int_D d^5y\ \omega - e\int d^4x\ A_\mu(x)J^\mu(x)\right)$$

with the four-dimensional current given by

$$J^\mu = \frac{1}{48\pi^2}\,\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\left(\{Q,U^\dagger\}\,\partial_\nu U\,U^\dagger\partial_\rho U\,U^\dagger\partial_\sigma U\right)$$


However, it turns out that we’re still not done. To get a fully gauge-invariant action,
we need to work to one higher order in the gauge coupling $e$. Here we simply quote the
result: the fully gauge invariant WZW term is given by

$$S_{\rm WZW} = k\left[\int_D d^5y\ \omega - e\int d^4x\ A_\mu(x)J^\mu(x) + \frac{ie^2}{24\pi^2}\int d^4x\ \epsilon^{\mu\nu\rho\sigma}(\partial_\mu A_\nu)A_\rho\ {\rm tr}\left(\{Q^2,U^\dagger\}\,\partial_\sigma U + U^\dagger QUQ\,U^\dagger\partial_\sigma U\right)\right]$$

How does this help us determine $k$? To see this, we need to expand out this action in
terms of pion fields. For simplicity, let’s do this for $N_f = 3$ quarks, with the charge
matrix (5.25) appropriate for QCD. Among the order $e^2$ terms from above, there sits

$$\frac{ke^2}{96\pi^2 f_\pi}\,\pi^0\,\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}$$


But we’ve seen this before: this is the term which captures the anomaly (5.27). To 
agree with the anomaly, the integer k must be equal to the number of colours 


$$k = N_c$$

This is a beautiful result. Until now the chiral Lagrangian has appeared to be
independent of the gauge group $SU(N_c)$; all that was needed was for the gauge dynamics
to initiate chiral symmetry breaking and then it seemed that it could be forgotten. We 
see that this isn’t quite true: a memory of the underlying gauge group survives as the 
coefficient of the WZW term. 


5.5.3 Baryons as Bosons or Fermions 

We saw in section 5.3 that the chiral Lagrangian provides a lovely and surprising new 
perspective on baryons: they are solitons, constructed from topologically twisted pion 
fields. The conserved baryon current is identified with the topological current 


$$B^\mu = \frac{1}{24\pi^2}\,\epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\left(U^\dagger(\partial_\nu U)\,U^\dagger(\partial_\rho U)\,U^\dagger\partial_\sigma U\right)$$

and the baryon number is the associated winding number. This winding number, which
we denote by $B \in \mathbb{Z}$, is computed by the integral

$$B = \frac{1}{24\pi^2}\int d^3x\ \epsilon_{ijk}\,{\rm tr}\left(U^\dagger(\partial_i U)\,U^\dagger(\partial_j U)\,U^\dagger\partial_k U\right)$$

However, there was something lacking in our previous discussion. From the underlying
quarks, we know that baryons should be bosons when $N_c$ is even and fermions when
$N_c$ is odd. How is this basic fact reproduced in the chiral Lagrangian? Here we show
that, for $N_f \geq 3$, the Wess-Zumino-Witten term is exactly what we need.


We focus on $N_f = 3$. (The story is basically unchanged for higher $N_f$.) Consider a
static Skyrmion of the form (5.20) embedded in the $SU(3)$ matrix $U$ as

$$U_0(\mathbf{x}) = \begin{pmatrix} U_{\rm Skyrme}(\mathbf{x}) & 0 \\ 0 & 1 \end{pmatrix}$$


We wish to compare the amplitude for two different processes to occur over some long 
time T. In the first process, the soliton simply sits stationary in space. In the second 
process, we rotate the soliton by $2\pi$ slowly about its origin. The first process has
amplitude $e^{iET}$, where $E$ is the energy of the soliton. We have to work a little harder
to compute the amplitude for the second process. There are two contributions from the
two different terms in the chiral Lagrangian (5.34). The first of these comes from the
usual kinetic term. Since this involves two time derivatives, it will contribute a piece of
order $\sim 1/T$ which can be ignored in the $T \to \infty$ limit. In contrast, the WZW term is
linear in time derivatives and will contribute a constant piece. This is what we want. 


Here we sketch the calculation. We saw in section 5.3 that the Skyrmion is invariant 
under a simultaneous spatial and isospin rotation. This means that we can swap our 
rotation in space for a flavour rotation. A suitable configuration is given by 

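$$U(\mathbf{x},t) = \begin{pmatrix} e^{i\pi t/T} & & \\ & e^{-i\pi t/T} & \\ & & 1 \end{pmatrix} U_0(\mathbf{x}) \begin{pmatrix} e^{-i\pi t/T} & & \\ & e^{i\pi t/T} & \\ & & 1 \end{pmatrix}$$

We must then extend this configuration over the 5-dimensional ball $D$ and compute
the integral

$$\Gamma = -\frac{i}{240\pi^2}\int_D d^5y\ \epsilon^{\mu\nu\rho\sigma\tau}\,{\rm tr}\left(U^\dagger\frac{\partial U}{\partial y^\mu}\,U^\dagger\frac{\partial U}{\partial y^\nu}\,U^\dagger\frac{\partial U}{\partial y^\rho}\,U^\dagger\frac{\partial U}{\partial y^\sigma}\,U^\dagger\frac{\partial U}{\partial y^\tau}\right)$$

One finds

$$\Gamma = \pi$$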


This is what we needed. It means that the amplitude for a soliton which rotates by $2\pi$
is not $e^{iET}$ but is instead

$$e^{iET}\,e^{iN_c\pi} = (-1)^{N_c}\,e^{iET}$$

The factor of $(-1)^{N_c}$ is telling us that these solitons are bosons when $N_c$ is even and
fermions when $N_c$ is odd.


Baryons when $N_f = 2$


When $N_f = 2$ there is no WZW term. This means that the chiral Lagrangian does
not know about the underlying number of colours $N_c$. Nonetheless, there is a new
ingredient. This arises because

$$\Pi_4(SU(2)) = \mathbb{Z}_2 \qquad (5.37)$$

while $\Pi_4(SU(N)) = 0$ for $N \geq 3$. Note that this is the same homotopy group that
arose in the non-perturbative anomaly described in section 3.4.3.


If we work in compactified Euclidean spacetime, then any field configuration in the
chiral Lagrangian is a map from $S^4$ to $SU(2)$ and so is labelled by $\nu = \pm 1$. This gives
us different options for the path integral. We could either weight all configurations
equally, or weight each of them with its sign $\nu$. These should be thought of as two
different theories which, in analogy with section 2.2, could be said to be distinguished
by a “discrete theta parameter” $\theta = 0$ or $\pi$.


Here is an example of a field configuration with $\nu = -1$: create a soliton-anti-soliton
pair from the vacuum, rotate one around the other, and then annihilate them again.
In the theory with $\theta = 0$ this configuration is not weighted any differently and the
solitons are bosons. In the theory with $\theta = \pi$, this configuration is weighted with an
extra factor of $-1$. Here the solitons are fermions.


We learn that in the theory with $N_f = 2$, we have a choice: we can either quantise the
solitons as bosons or as fermions. This choice arises as an extra discrete parameter
which we must stipulate to fully define the path integral. 


5.6 ’t Hooft Anomaly Matching


Until now, our strategy has been to assume that the quark condensate (5.4) forms and 
then explore the consequences. Our justification for the condensate itself was rather 
flimsy. In this section we will improve slightly on this state of affairs. While we will 
not give a proof that the condensate forms, we will show that it is implied by another, 
well-known effect of strongly coupled gauge theories: confinement. To show this, we 
will use the ’t Hooft anomaly matching arguments of Section 3.5.


5.6.1 Confinement Implies Chiral Symmetry Breaking


By now the global symmetry group of $G = SU(N_c)$ gauge theory with $N_f$ quarks should
be very familiar: it is

$$G_F = U(1)_V \times SU(N_f)_L \times SU(N_f)_R$$

This group has a ’t Hooft anomaly which, at high energies, arises from the quarks. If
the theory confines, this anomaly must be reproduced by massless bound state fermions
in the infra-red. The essence of the argument is that no such bound states can exist.


Let’s first compute the ’t Hooft anomalies in the ultra-violet, where the quarks
contribute. There is no anomaly for $[U(1)_V]^3$, but there are anomalies for both $[SU(N_f)_L]^3$
and $[SU(N_f)_L]^2 \times U(1)_V$, together with the corresponding anomalies for $SU(N_f)_R$. We
have

$$[SU(N_f)_L]^3:\ \ A_1 = N_c \qquad\qquad [SU(N_f)_L]^2 \times U(1)_V:\ \ A_2 = N_c \qquad (5.38)$$

where, in both cases, $A = N_c$ is counting the number of colours of the quarks.


What about in the infra-red? Confinement means that the quarks bind to form 
colour singlets. Our task is to figure out how the resulting states transform under the 
flavour symmetry $G_F$. Here the details depend on the choice of gauge group. When
$N_c$ is even, both mesons and baryons are bosons, so there are no massless fermions
available and hence no solutions to the ’t Hooft anomaly conditions. This is a striking
result. It tells us that there is no way to form massless bound states which match the
anomaly. For the theory to be consistent,
it must be that $G_F$ is spontaneously broken in the infra-red. The simplest possibility
is that the symmetry is broken down to its vector-like subgroup which is free from 
anomalies. This, of course, is the pattern of chiral symmetry breaking (5.5) that arises 
from the quark condensate. 


Fermionic Baryons 


When the number of colours $N_c$ is odd the baryons are fermions. Now we have to work
a little harder. Is it possible that these baryons are massless and match the anomaly?
To proceed, we will restrict attention to the simplest case of

$$N_c = 3$$

The arguments that follow can be generalised to arbitrary $SU(N_c)$ gauge group.
If the gauge group confines, then any massless fermion must be a colour singlet. The
only possibility is baryons, comprised of three quarks. Each constituent quark can be
either left-handed or right-handed. Under $SU(N_f)_L \times SU(N_f)_R \subset G_F$, the left-handed
fermions transform as $(N_f, 1)$, while the right-handed fermions transform as $(1, N_f)$.
Both of these Weyl fermions have charge $+1$ under $U(1)_V$. The putative massless
baryons therefore transform under the $G_F$ flavour symmetry in representations given
by the Young diagrams,

$$\ytableaushort{lll}\ ,\qquad \ytableaushort{l,l,l}\ ,\qquad \ytableaushort{l}\otimes\ytableaushort{rr}\ ,\qquad \ytableaushort{l}\otimes\ytableaushort{r,r}\ ,\qquad \ytableaushort{ll,l} \qquad (5.39)$$


What are the helicities of these baryons? We can take a pair of left- or right-handed
fermions and form a Lorentz scalar $\epsilon^{\alpha\beta}\psi_\alpha\psi_\beta$ where, for once, we’ve explicitly written
the $\alpha, \beta$ spinor indices. This means that it’s possible to contract the spinor indices
such that each baryon above is left-handed. Similarly, if we exchange $\ytableaushort{l}$ and $\ytableaushort{r}$ then
we have the possible set of right-handed baryons

$$\ytableaushort{rrr}\ ,\qquad \ytableaushort{r,r,r}\ ,\qquad \ytableaushort{r}\otimes\ytableaushort{ll}\ ,\qquad \ytableaushort{r}\otimes\ytableaushort{l,l}\ ,\qquad \ytableaushort{rr,r} \qquad (5.40)$$

These have the opposite helicity to the representations in (5.39). The $[U(1)_V]^3$ anomaly
remains trivially satisfied if the spectrum of massless baryons is vector-like so we will
assume that if a massless baryon of the type (5.39) arises, then its counterpart in (5.40)
also arises.


Since we’re dealing with a strongly coupled theory, how can we be sure that the 
indices are contracted so that (5.39) are left-handed and (5.40) are right-handed? First, 
there is a theorem by Weinberg and Witten which says that one cannot form massless 
bound states with helicity $|\lambda| \geq 1$. So if the massless baryons above do indeed form then
they must have helicity $\pm\frac{1}{2}$. But is it possible to dress these baryons with gluons which
shift their helicity by $\pm 1$?


To be on the safe side, we associate an index, $p_a \in \mathbb{Z}$, with $a = 1,\ldots,5$ to each
of the five baryons in (5.39). The magnitude $|p_a|$ denotes the number of species of
baryon that arise in the massless spectrum. If these baryons are left-handed then we
take $p_a > 0$; if they are right-handed then we take $p_a < 0$. Our task is to find which
values of $p_a$ will satisfy anomaly matching and reproduce (5.38).
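Next, we need a little group theory. For a representation $R$ of $SU(N)$, we will need
to know the dimension ${\rm dim}(R)$, the anomaly coefficient $A(R)$, as well as the Dynkin
index $\mu(R)$,

$${\rm tr}\,T^aT^b = \frac{1}{2}\,\mu(R)\,\delta^{ab}$$

For each of the representations of interest, we have

$$\begin{array}{c|c|c|c}
R & {\rm dim}(R) & \mu(R) & A(R) \\ \hline
\ydiagram{1} & N_f & 1 & 1 \\
\ydiagram{2} & \frac{1}{2}N_f(N_f+1) & N_f+2 & N_f+4 \\
\ydiagram{1,1} & \frac{1}{2}N_f(N_f-1) & N_f-2 & N_f-4 \\
\ydiagram{3} & \frac{1}{6}N_f(N_f+1)(N_f+2) & \frac{1}{2}(N_f+2)(N_f+3) & \frac{1}{2}(N_f+3)(N_f+6) \\
\ydiagram{1,1,1} & \frac{1}{6}N_f(N_f-1)(N_f-2) & \frac{1}{2}(N_f-2)(N_f-3) & \frac{1}{2}(N_f-3)(N_f-6) \\
\ydiagram{2,1} & \frac{1}{3}N_f(N_f^2-1) & N_f^2-3 & N_f^2-9
\end{array}$$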
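A quick way to gain confidence in a table like this is to check it against the decomposition of three fundamentals into one symmetric, one anti-symmetric and two mixed-symmetry representations. Dimensions, Dynkin indices and anomaly coefficients must then each add up to $N_f^3$, $3N_f^2$ and $3N_f^2$ respectively. The sketch below encodes the table (the dictionary keys are our own labels) and tests these sum rules.

```python
from fractions import Fraction


def representation_table(n):
    """Return {label: (dim, Dynkin index, anomaly coefficient)} for SU(n)."""
    half, third, sixth = Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)
    return {
        "fund": (n, 1, 1),
        "sym2": (half * n * (n + 1), n + 2, n + 4),
        "asym2": (half * n * (n - 1), n - 2, n - 4),
        "sym3": (sixth * n * (n + 1) * (n + 2), half * (n + 2) * (n + 3), half * (n + 3) * (n + 6)),
        "asym3": (sixth * n * (n - 1) * (n - 2), half * (n - 2) * (n - 3), half * (n - 3) * (n - 6)),
        "mixed3": (third * n * (n * n - 1), n * n - 3, n * n - 9),
    }


def three_quark_totals(n):
    """Sum (dim, index, anomaly) over sym3 + 2 * mixed3 + asym3."""
    table = representation_table(n)
    return tuple(
        table["sym3"][i] + 2 * table["mixed3"][i] + table["asym3"][i] for i in range(3)
    )


for n in range(3, 8):
    print(n, three_quark_totals(n))

su3 = representation_table(3)
print(su3["sym3"], su3["mixed3"], su3["asym2"])
```

For $SU(3)$ the table reproduces the familiar decuplet, the adjoint (with vanishing anomaly) and the anti-fundamental (with anomaly $-1$).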


We can now compute the infra-red anomalies, assuming that we have $p_a$ massless
baryons of each type. For $[SU(N_f)_L]^3$ with $N_f \geq 3$, the anomaly is

$$A_1 = \frac{1}{2}(N_f+3)(N_f+6)\,p_1 + \frac{1}{2}(N_f-3)(N_f-6)\,p_2 + \left(\frac{1}{2}N_f(N_f+1) - N_f(N_f+4)\right)p_3$$

$$\qquad + \left(\frac{1}{2}N_f(N_f-1) - N_f(N_f-4)\right)p_4 + (N_f^2-9)\,p_5$$

Note that the baryons with numbers $p_3$ and $p_4$ arise from tensor products and have two
terms. For example, for $p_3$ the first term comes from the left-handed baryon $\ytableaushort{l}\otimes\ytableaushort{rr}$,
and the second, with the minus sign, from the right-handed baryon $\ytableaushort{r}\otimes\ytableaushort{ll}$.


Meanwhile, for the $[SU(N_f)_L]^2 \times U(1)_V$ anomaly, each baryon has charge 3 under the
$U(1)_V$. Dividing through by this, we get a contribution proportional to the Dynkin
index $\mu(R)$,

$$\frac{A_2}{3} = \frac{1}{2}(N_f+2)(N_f+3)\,p_1 + \frac{1}{2}(N_f-2)(N_f-3)\,p_2 + \left(\frac{1}{2}N_f(N_f+1) - N_f(N_f+2)\right)p_3$$

$$\qquad + \left(\frac{1}{2}N_f(N_f-1) - N_f(N_f-2)\right)p_4 + (N_f^2-3)\,p_5$$

To match the anomalies, we need to find $p_a$ such that $A_1 = A_2 = 3$.


To start, let’s look at $N_f = 3$. Anomaly matching gives

$$A_1 = 27p_1 - 15p_3 + 6p_4 = 3 \quad\text{and}\quad \frac{A_2}{3} = 15p_1 - 9p_3 + 6p_5 = 1$$

We can immediately see that there can be no solutions to the second of these equations
since $A_2/3$ in the infra-red theory is necessarily a multiple of 3 and cannot reproduce
the ultra-violet anomaly $A_2/3 = 1$. We learn that $G = SU(3)$ gauge theory with
$N_f = 3$ massless fermions must spontaneously break the $G_F$ flavour symmetry, as long
as the theory confines. You can check that the same argument works whenever $N_f$ is
a multiple of 3.
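The divisibility claim is easy to check by machine. The sketch below assumes the general expression for $A_2/3$ given above, with coefficients of $p_1,\ldots,p_5$ built from the Dynkin indices, and tests whether every coefficient is a multiple of 3 for a few values of $N_f$. The function name is our own.

```python
def mixed_anomaly_coefficients(n):
    """Integer coefficients of p_1..p_5 in A_2/3 for N_f = n flavours."""
    return [
        (n + 2) * (n + 3) // 2,
        (n - 2) * (n - 3) // 2,
        n * (n + 1) // 2 - n * (n + 2),
        n * (n - 1) // 2 - n * (n - 2),
        n * n - 3,
    ]


for flavours in (3, 4, 6, 9, 12):
    coefficients = mixed_anomaly_coefficients(flavours)
    all_divisible = all(c % 3 == 0 for c in coefficients)
    print(flavours, coefficients, all_divisible)
```

For $N_f = 3, 6, 9, 12$ every coefficient is a multiple of 3, so no integer choice of $p_a$ can give $A_2/3 = 1$. For $N_f = 4$ the coefficients are not all multiples of 3, which is why that case needs a further argument.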


Decoupling Massive Quarks 


When $N_f$ is not a multiple of 3, things are not quite so simple. Indeed, we will need
one further ingredient to complete the argument. To see this, let’s look at the anomaly
matching conditions for $G = SU(3)$ gauge theory with $N_f = 4$ flavours. They are:

$$A_1 = 35p_1 - p_2 - 22p_3 + 6p_4 + 7p_5 = 3$$

$$\frac{A_2}{3} = 21p_1 + p_2 - 14p_3 - 2p_4 + 13p_5 = 1$$

Now there are solutions. For example $p_2 = 3$ and $p_4 = 1$ with $p_1 = p_3 = p_5 = 0$ does
the job. This corresponds to four massless baryons in the representations

$$\left[3(\bar{4},1) \oplus (4,6)\right]_L \oplus \left[3(1,\bar{4}) \oplus (6,4)\right]_R \qquad (5.41)$$

where the $L$ and $R$ subscripts denote the chirality of these Weyl spinors. Note that
the left-handed baryons now transform under both $SU(4)_L$ and $SU(4)_R$ of the chiral
flavour symmetry.
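The existence of solutions for $N_f = 4$ can be confirmed by a brute-force search. The sketch below takes the two matching conditions with the numerical coefficients quoted above and scans small integer values of $p_a$. The search bound is an arbitrary choice of ours.

```python
from itertools import product

A1_COEFFICIENTS = (35, -1, -22, 6, 7)     # coefficients of p_1..p_5 in A_1
A2_COEFFICIENTS = (21, 1, -14, -2, 13)    # coefficients of p_1..p_5 in A_2/3


def matching_solutions(bound=3):
    """All integer (p_1..p_5) with |p_a| <= bound solving A_1 = 3 and A_2/3 = 1."""
    solutions = []
    for p in product(range(-bound, bound + 1), repeat=5):
        a1 = sum(c * x for c, x in zip(A1_COEFFICIENTS, p))
        a2 = sum(c * x for c, x in zip(A2_COEFFICIENTS, p))
        if a1 == 3 and a2 == 1:
            solutions.append(p)
    return solutions


solutions = matching_solutions()
print(len(solutions), (0, 3, 0, 1, 0) in solutions)
```

The solution $p_2 = 3$, $p_4 = 1$ discussed in the text appears in the list.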


Naively, the existence of the solution (5.41) suggests that there is a phase with 
massless baryons and the chiral symmetry left unbroken. In fact, this cannot happen. 
The problem comes when we think about giving one of the quarks a mass. We will 
make the following assumption: when we give a quark a mass, any baryon that contains 
this quark will also become massive. It is not obvious that this happens, and we will 
have to work harder below to justify this. But, for now, let’s assume that this is true 
and see where it leads us. 
If we give one of the quarks a mass, then the symmetry group is explicitly broken to

$$G_F = U(1)_V \times SU(4)_L \times SU(4)_R \ \longrightarrow\ G'_F = U(1)_V \times SU(3)_L \times SU(3)_R$$

What happens to our putative massless spectrum (5.41)? A little group decomposition
tells us that under $G'_F$, the left-handed baryons transform as

$$3(\bar{4},1) \to 3(\bar{3},1) \oplus 3(1,1) \quad\text{and}\quad (4,6) \to (3,3) \oplus (3,\bar{3}) \oplus (1,3) \oplus (1,\bar{3})$$

The right-handed baryons have their $SU(3)_L \times SU(3)_R$ representations reversed. Of
these, the $(1,1)$ and the $(3,\bar{3})$ do not contain the massive fourth quark. By our
assumption above, the remainder should become massive.


There is a further constraint however: all of the baryons that contain the fourth 
quark should become massive while leaving the surviving symmetry $G'_F$ intact. This
is because as the mass becomes large, we should return to the theory with $N_f = 3$
flavours and the symmetry group $G'_F$. Although we now know that $G'_F$ will ultimately
be spontaneously broken by the strong coupling dynamics, this should happen at the
scale $\Lambda_{\rm QCD}$ and not at the much higher scale of the fourth quark mass.


So what $G'_F$-singlet mass terms can we write for the baryons that contain the fourth
quark? The left-handed spinors transform as $3(\bar{3},1) \oplus (3,3) \oplus (1,\bar{3}) \oplus (1,3)$. Of these,
$(3,3)$ can happily pair up with its right-handed counterpart. Further, one of the $(\bar{3},1)$
representations can pair up with the right-handed counterpart of $(1,\bar{3})$. But that still
leaves us with $2(\bar{3},1) \oplus (1,3)$ and these have nowhere to go. Any mass term will
necessarily break the remaining $G'_F$ chiral symmetry and, as we argued above, this is
unacceptable.
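The pairing argument is a piece of bookkeeping that can be written as multiset arithmetic. In the sketch below, each left-handed representation is a pair of labels for $SU(3)_L \times SU(3)_R$, the right-handed spectrum is obtained by swapping the two entries, and a $G'_F$-invariant mass term pairs a left-handed multiplet with a right-handed multiplet in the same representation. The placement of the bars follows the decomposition of the anti-symmetric representations and is an assumption of this sketch.

```python
from collections import Counter

# Left-handed baryons that contain the massive fourth quark.
left_handed = Counter({
    ("3bar", "1"): 3,
    ("3", "3"): 1,
    ("1", "3bar"): 1,
    ("1", "3"): 1,
})

# The right-handed baryons have the two flavour representations exchanged.
right_handed = Counter({(b, a): count for (a, b), count in left_handed.items()})

# A symmetry-preserving mass term needs matching representations on both sides.
paired = left_handed & right_handed
unpaired_left = left_handed - paired

print(dict(paired))
print(dict(unpaired_left))
```

The leftover multiset is two copies of $(\bar{3},1)$ and one $(1,3)$, matching the count in the text. These have no partner and so cannot be lifted without breaking $G'_F$.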


The result above should not be surprising. Any baryon that can get a mass without
breaking $G'_F$ does not change the ’t Hooft anomaly for $G'_F$. If it were possible for all
the baryons containing the massive quark to get a mass without breaking $G'_F$ then the
remaining massless baryons should satisfy anomaly matching. Yet we’ve seen that no
such solution is possible for $N_f = 3$.


The upshot of this argument is that there exists no solution to anomaly matching
for $N_f = 4$ which is consistent with the decoupling of massive quarks. It is simple to
extend this to all $N_f$ and, indeed, to all $N_c$. ’t Hooft anomaly matching then tells us
that the chiral symmetry must be broken for all $N_c \geq 2$ and all $N_f \geq 3$.
5.6.2 Massless Baryons when $N_f = 2$?


There is one situation where it is possible to satisfy the anomaly matching: this is when
$N_f = 2$. Since there is no triangle anomaly for $SU(2)$, we need only worry about the
mixed $[SU(2)_L]^2 \times U(1)_V$ ’t Hooft anomaly. We can import our results from earlier,
although we should be a little bit careful: the anti-symmetric representation $\ydiagram{1,1}$ is the
singlet of $SU(2)$ while the representation $\ydiagram{1,1,1}$ does not exist. The ’t Hooft matching
condition for gauge group $SU(3)$ now gives

$$\frac{A_2}{3} = 10p_1 - 5p_3 + p_4 = 1$$

This has many solutions. The simplest possibility is $p_1 = p_3 = 0$ and $p_4 = 1$. This means
that we can match the anomaly if there are massless baryons which transform under
$SU(2)_L \times SU(2)_R \times U(1)_V$ as

$$(2,1)_3 \oplus (1,2)_3 \qquad (5.42)$$
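To illustrate the statement that there are many solutions, the sketch below enumerates small integer solutions of the condition $10p_1 - 5p_3 + p_4 = 1$, taken in the form quoted above. The search bound is our own choice.

```python
from itertools import product


def two_flavour_solutions(bound=2):
    """Integer (p_1, p_3, p_4) with |p_a| <= bound solving 10 p_1 - 5 p_3 + p_4 = 1."""
    return [
        (p1, p3, p4)
        for p1, p3, p4 in product(range(-bound, bound + 1), repeat=3)
        if 10 * p1 - 5 * p3 + p4 == 1
    ]


solutions = two_flavour_solutions()
print(solutions)
```

Even in this small window there are several solutions, of which $(p_1, p_3, p_4) = (0, 0, 1)$ uses the fewest massless species.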


So for $N_f = 2$ we cannot use ’t Hooft anomaly matching to rule out the existence of
massless baryons. But it does not mean that they actually arise. To understand what
happens, we need to look more carefully at the actual dynamics. The only real tool we
have at our disposal is the lattice and this strongly suggests that even for $N_f = 2$ the
chiral symmetry is broken and there are no massless baryons.


But what if... 


Although the lattice tells us that the chiral symmetry is broken for $N_f = 2$, it is
nonetheless an interesting exercise to understand better how we could have ended up 
with a massless baryon. The story that we will find has a nice twist and — as we will 
see in Section 5.6.4 — turns out to be realised in other contexts. 


To start, let’s return to our calculation of the classical force between quarks. We saw in Section 2.5.1 that a quark and anti-quark attract in the singlet channel and repel in the adjoint. This played a role in our initial discussion in Section 5.1 of why a quark condensate $\langle\bar\psi\psi\rangle$ might form in the first place.


However, we also saw in Section 2.5.1 that two quarks attract in the anti-symmetric channel and repel in the symmetric channel. We might wonder if it’s possible to form a condensate of quark pairs, rather than quark-anti-quark pairs. Such a condensate would break the gauge group.
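
In more detail, for $N_c = 3$ and $N_f = 2$ the initial gauge and global group of the theory is $G = SU(3)_{\rm gauge} \times SU(2)_L \times SU(2)_R \times U(1)_V$. The quarks transform as

$$\psi_- : (\mathbf{3},\mathbf{2},\mathbf{1}) \quad\text{and}\quad \psi_+ : (\mathbf{3},\mathbf{1},\mathbf{2}) \tag{5.43}$$

For $N_f = 2$, a condensate of quarks can take the form

$$\langle \psi^a_{-\,i}\,\psi^b_{-\,j}\rangle = \langle \psi^a_{+\,i}\,\psi^b_{+\,j}\rangle = \epsilon^{abc}\,\epsilon_{ij}\,\sigma_c \tag{5.44}$$
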
Here the spinor indices are contracted so that the condensate is Lorentz invariant. The use of $\epsilon_{ij}$ means that the condensate is also invariant under the global $SU(2)_L \times SU(2)_R$ chiral symmetry. However, since the condensate $\sigma_a$ transforms in the $(\mathbf{3}\otimes\mathbf{3})_{\text{anti-sym}} = \bar{\mathbf{3}}$ of $SU(3)$, it breaks the gauge symmetry

$$G \supset SU(3)_{\rm gauge} \longrightarrow SU(2)_{\rm gauge}$$


where we’ve added the “gauge” label because the number of different $SU(2)$ groups is about to get confusing. Naively it looks like the condensate (5.44) also breaks the $U(1)_V$ symmetry, but this can be restored by combining it with a suitable $U(1) \subset SU(3)_{\rm gauge}$. For example, if we take $\sigma_a = \sigma\,\delta_{a1}$, then the generator

$$Q_{V'} = Q_V + {\rm diag}(2,-1,-1)_{\rm gauge}$$

is unbroken and commutes with $SU(2)_{\rm gauge}$. This means that, at low-energies, our theory has the symmetry

$$G' = SU(2)_{\rm gauge} \times SU(2)_L \times SU(2)_R \times U(1)_{V'}$$


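How do the quarks (5.43) transform under $G'$? A little bit of representation decomposition shows

$$\psi_- : (\mathbf{1},\mathbf{2},\mathbf{1})_3 \oplus (\mathbf{2},\mathbf{2},\mathbf{1})_0 \quad\text{and}\quad \psi_+ : (\mathbf{1},\mathbf{1},\mathbf{2})_3 \oplus (\mathbf{2},\mathbf{1},\mathbf{2})_0$$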


The existence of the condensate can be thought of as giving mass to the fermions that sit in the $\mathbf{2}$ of $SU(2)_{\rm gauge}$. (Note that, as in the condensate (5.44), we can form a singlet from $\mathbf{2}\otimes\mathbf{2}$, so there’s no problem with either gauge invariance or chiral symmetry.) But those fermions that are singlets under $SU(2)_{\rm gauge}$ are protected from getting a mass by the surviving $U(1)_{V'}$ chiral symmetry. The curious fact is that these massless fermions sit in precisely the representations (5.42) which satisfy ’t Hooft anomaly matching.
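
The charge assignments in this decomposition can be checked with a few lines of arithmetic. The sketch below assumes that the unbroken generator is $Q_V + {\rm diag}(2,-1,-1)$ acting on the colour index of a quark with $Q_V = 1$, and that the condensate points along the first colour; the names are hypothetical.

```python
# Assumed gauge generator diag(2, -1, -1) and the U(1)_V charge of a single quark.
GAUGE_DIAG = (2, -1, -1)
Q_V = 1

def shifted_charges():
    """Charge of each quark colour component under Q_V + diag(2, -1, -1)."""
    return [Q_V + t for t in GAUGE_DIAG]

def pair_charge(a, b):
    """Shifted charge of a quark pair with colour indices a and b (0-based)."""
    q = shifted_charges()
    return q[a] + q[b]

# The first colour is the SU(2) gauge singlet; the other two form the doublet.
print(shifted_charges())
# The pair of doublet colours is the one that condenses when sigma points along the first colour.
print(pair_charge(1, 2))
```

The singlet colour carries charge 3 and the doublet colours carry charge 0, while the condensing pair is neutral, so the shifted $U(1)$ is indeed unbroken in this toy bookkeeping.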


There’s something rather odd about this. In the ’t Hooft anomaly matching argument, we assumed that the theory confines and looked for massless baryons: composites of three underlying quarks. In the analysis above, however, we proposed that the quark condensate Higgses the gauge group and the massless fermion is just a single quark, albeit with $U(1)_{V'}$ charge $+3$.

In fact, these are two different ways of looking at the same underlying physics. In the presence of the condensate (5.44), the vacuum is filled with pairs of quarks which can mix with the lone massless quark to form the composite baryon. Moreover, as we saw in Section 2.7.3, when we have a scalar in the fundamental representation (here played by the condensate $\psi\psi$) there is no distinction between the Higgs and confining phases. The two descriptions, in terms of massless baryons or in terms of a condensate Higgs field, use different words but are telling us the same thing. This situation sometimes goes by the rather pretentious name of complementarity (a much overused word in physics, and one which is possibly better saved for other, more subtle, phenomena).


As we mentioned above, it appears that the scenario sketched here doesn’t occur for QCD-like theories with $N_f = 2$, presumably because the condensate which breaks chiral symmetry is preferred for more subtle, dynamical reasons. Nonetheless, something similar does happen for chiral gauge theories.


5.6.3 The Vafa-Witten-Weingarten Theorems 


To invoke the full power of ’t Hooft anomaly matching, we needed to assume that any baryon that contains a massive quark is itself massive. This is not at all obvious in a strongly interacting theory of the kind we’re dealing with. When the mass of the quark is very large, $m \gg \Lambda_{QCD}$, it is certainly true that the baryon must be massive. But for small quark masses $m < \Lambda_{QCD}$, we could well imagine a situation where the binding energy cancels the quark mass, resulting in a massless bound state that contains massive constituents.


Two possibilities are depicted in Figure 47. The first shows the mass of the baryon increasing monotonically with the constituent quark mass. This is the scenario that we assumed above. The second figure shows another plausible scenario: the baryon remains massless for some finite value of the quark mass, before the theory undergoes some kind of phase transition at $m = m_\star$. If this were to happen, it would nullify our previous conclusions.


Fortunately, the second scenario cannot happen. It is ruled out by a theorem due 
to Vafa and Witten. In fact, there are a number of such theorems, all of which have 
rather similar proofs. We prove four such theorems below, the first two due to Vafa 
and Witten, the second two due to Weingarten. As we will see, the second Vafa-Witten 
theorem can be invoked to rule out the scenario shown in the second figure. 

Figure 47: Two possible behaviours for the baryon mass $M_{\rm baryon}$ as a function of the quark mass. The Vafa-Witten theorem rules out the second option.


A Positive Definite Measure 


Our setting is the QCD-like theories discussed throughout this chapter. All the theorems that we will prove rely on the same property of the path integral: a positive definite measure.


When computing correlation functions of gauge invariant operators, say $O(x)$, we need to do the path integral. In Euclidean space, this takes the form

$$\langle O(x)\ldots O(y)\rangle = \frac{1}{Z}\int \mathcal{D}A\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\;\; e^{-S_{YM} + \int d^4x\; i\bar\psi_i(\not D + m)\psi_i}\;\, O(x)\ldots O(y)$$

Here $S_{YM}$ is the usual Yang-Mills action. For simplicity, we’ve given each quark a common mass $m$, which we take to be positive: $m > 0$. Clearly it would be simple to generalise this. For some applications below, we’ll explore the chiral limit by taking $m \to 0$. In practice, we should also include gauge fixing terms in this expression, but these don’t affect the discussion below so we omit them for simplicity.


It is straightforward to do the fermionic path integral, leaving us with the path integral over the gauge fields. This takes the form

$$\langle O(x)\ldots O(y)\rangle = \frac{1}{Z}\int \mathcal{D}A\;\, e^{-S_{YM}}\,\big[\det(\not D + m)\big]^{N_f}\; O(x)\ldots O(y)$$


For many applications of interest, the operators O will also depend on the fermions. 
In this case, any fermion bi-linear should be replaced by its propagator in the usual 
manner. We’ll see examples below. 


We see that the effect of the fermions is to change the measure of the path integral over the gauge field. We write the correlation functions as

$$\langle O(x)\ldots O(y)\rangle = \int d\mu\;\, O(x)\ldots O(y)$$

where all the trickiness has now been absorbed in the measure

$$d\mu = \frac{1}{Z}\,\mathcal{D}A\;\, e^{-S_{YM}}\,\big[\det(\not D + m)\big]^{N_f} \tag{5.45}$$


The key observation is that this measure is positive definite. This is clearly true for the Yang-Mills part of the action, with $S_{YM} = \frac{1}{2g^2}\int d^4x\; {\rm tr}\, F^{\mu\nu}F_{\mu\nu}$. But it’s also true for the Dirac operator. This is because QCD is a vector-like theory. Suppose that, for a choice of gauge field $A_\mu$, the Dirac operator has a non-zero eigenvalue $\lambda \in \mathbb{R}$, so there is an eigenspinor

$$i\not D\,\psi = \lambda\psi$$

Then we also have an eigenvalue $-\lambda$. This follows because $\{\gamma^5, \not D\} = 0$, so

$$i\not D\,(\gamma^5\psi) = -\gamma^5\, i\not D\,\psi = -\lambda\,\gamma^5\psi$$


Of course, there may also be some number, $n$, of zero modes of $\not D$. The general form of the determinant is then

$$\det(\not D + m) = m^n\prod_{\lambda > 0}(m - i\lambda)(m + i\lambda) = m^n\prod_{\lambda > 0}(m^2 + \lambda^2) \tag{5.46}$$

which is manifestly positive definite provided $m > 0$. Before we go on, it’s worth pausing to make a couple of comments.


• It’s important that we set the theta angle to zero, $\theta = 0$, for the following arguments. This is because the theta term comes with an $\epsilon^{\mu\nu\rho\sigma}$ symbol,

$$S_\theta = \frac{\theta}{32\pi^2}\int d^4x\;\, \epsilon^{\mu\nu\rho\sigma}\,{\rm tr}\,F_{\mu\nu}F_{\rho\sigma} \tag{5.47}$$

and so, when Wick rotated to Euclidean space, appears in the path integral as $e^{iS_\theta}$. That extra factor of $i$ means that, when $\theta \neq 0$, the measure is not positive definite.


• Relatedly, the mass should be positive, $m > 0$. We can see this explicitly in the contribution from the $n$ zero modes in (5.46). But it’s simpler to note that, from the chiral anomaly discussed in Section 3.3.3, a negative mass can be viewed as a non-zero $\theta$ angle.


• Clearly we needed the fermions to sit in a vector-like representation of the gauge group to argue for a positive definite measure. This means that many of the arguments we will make below fail in chiral gauge theories.
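
The positivity of the determinant (5.46) can be illustrated with a finite-dimensional toy model. The sketch below is not a lattice Dirac operator: it simply assumes an anti-Hermitian matrix of the block form $D = \begin{pmatrix} 0 & W \\ -W^\dagger & 0\end{pmatrix}$, which anticommutes with $\gamma^5 = {\rm diag}(1,\ldots,1,-1,\ldots,-1)$, and checks numerically that $\det(D + m)$ is real and positive for random $W$. All function names are hypothetical.

```python
import random

def det(matrix):
    """Determinant of a small square matrix by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def toy_dirac_plus_mass(w, m):
    """Return D + m for the anti-Hermitian toy operator D = [[0, W], [-W^dagger, 0]]."""
    k = len(w)
    size = 2 * k
    mat = [[0j] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            mat[i][k + j] = w[i][j]                 # upper-right block: W
            mat[k + i][j] = -w[j][i].conjugate()    # lower-left block: -W^dagger
    for i in range(size):
        mat[i][i] += m
    return mat

def random_block(k, rng):
    """A k x k matrix with independent complex Gaussian entries."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(k)] for _ in range(k)]

rng = random.Random(0)
values = [det(toy_dirac_plus_mass(random_block(2, rng), m=0.3)) for _ in range(100)]
print(all(v.real > 0 and abs(v.imag) < 1e-9 for v in values))
```

In this toy setting $\det(D+m) = \det(m^2 + W^\dagger W)$, which mirrors the pairing of eigenvalues $\pm\lambda$ in (5.46).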


Let’s now see what a positive definite measure buys us. 

Theorem 1: Parity is not Spontaneously Broken


Parity is a symmetry of QCD. One might wonder if it remains a symmetry of the ground state. The spontaneous breaking of parity would show up as a non-zero expectation value for some parity odd scalar operator $O(x)$, which plays the role of an order parameter. We will argue that, in QCD, we necessarily have

$$\langle O\rangle = 0$$

for any parity odd scalar.


To see this, consider the QCD Lagrangian deformed by the addition of this parity odd scalar,

$$\mathcal{L}(\alpha) = \mathcal{L}_{QCD} + \alpha\, O$$

To leading order in $\alpha$, the energy density of the ground state is

$$E(\alpha) = E(0) + \alpha\langle O\rangle$$

If parity is spontaneously broken in QCD, then there are two ground states and $\langle O\rangle$ is positive in one ground state and negative in the other. The system can always sit in the ground state for which $\alpha\langle O\rangle < 0$, so spontaneous breaking of parity implies that $E(\alpha) < E(0)$ for arbitrarily small $\alpha$.


Let’s now calculate $E(\alpha)$ in the path integral. We have

$$e^{-VE(\alpha)} = \int d\mu\;\, e^{i\alpha\int d^4x\; O}$$

where $V$ is the volume of (Euclidean) spacetime. The important point is the factor of $i$ in the exponent. This arises in Euclidean space only for parity odd operators because they necessarily come with an odd number of $\epsilon_{\mu\nu\rho\sigma}$. Indeed, we already saw an example of this with the $\theta$ term (5.47). We learn that adding a parity odd operator to the action changes the path integral by a phase. Because the measure is positive definite, this phase can only decrease the value of the path integral, so

$$e^{-VE(\alpha)} \leq e^{-VE(0)} \quad\Rightarrow\quad E(\alpha) \geq E(0)$$


We learn that the energy density has a minimum at $\alpha = 0$ which, in turn, tells us that parity is not spontaneously broken in vector-like theories.
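
The key step, that a phase can only reduce an integral taken with a positive measure, is just the triangle inequality. A minimal numerical sketch follows, in which a list of random positive weights stands in for the measure and random angles stand in for the phase $\alpha\int d^4x\, O$; both are illustrative assumptions rather than anything computed from QCD.

```python
import cmath
import random

def phase_averaged(weights, phases):
    """Return |sum_k w_k exp(i phi_k)| together with sum_k w_k for positive weights w_k."""
    with_phase = abs(sum(w * cmath.exp(1j * p) for w, p in zip(weights, phases)))
    without_phase = sum(weights)
    return with_phase, without_phase

rng = random.Random(1)
weights = [rng.random() for _ in range(1000)]            # stands in for the positive measure
phases = [rng.uniform(-0.5, 0.5) for _ in range(1000)]   # stands in for the parity odd phase
lhs, rhs = phase_averaged(weights, phases)
print(lhs <= rhs)
```

If any of the weights were allowed to be negative, the comparison could go either way, which is why the argument fails without a positive definite measure.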


As a side remark, if we apply the argument above to the theta term itself (5.47), we learn that the addition of a theta term necessarily increases the energy of the vacuum: $E(\theta) \geq E(0)$. This observation sits at the heart of axion attempts to explain why the QCD theta angle is so small in our world.

Theorem 2: A Bound on Current-Current Correlation Functions


We now turn to the promised result: a relation between the masses of bound states and the bare masses of the underlying quarks. To proceed, we’re going to consider two point functions of currents. We will take

$$J^a_\mu = \bar\psi\,\gamma_\mu T^a\,\psi$$

where $T^a$ is some $SU(N_f)$ flavour generator. In terms of the path integral, the two point function can be written as

$$\langle J^a_\mu(x)\,J^b_\nu(y)\rangle = \int d\mu\;\, {\rm tr}\big(\gamma_\mu T^a S(x,y)\,\gamma_\nu T^b S(y,x)\big)$$

where the trace is over spinor and flavour indices, and the propagator takes the form

$$S(x,y) = \langle x|\,\frac{1}{\not D + m}\,|y\rangle \tag{5.48}$$


Note that this is the propagator evaluated in the background of a fixed gauge field $A_\mu$. The hard part is to then integrate over all gauge configurations, a procedure that is swept into the innocuous looking $\int d\mu$. Of course, we’re not going to be able to do this integral. But we will be able to make remarkable progress simply from the knowledge that the measure is positive definite.


We will first give a slightly rough outline of the result, together with an explanation 
of why it shows what we want. We will then proceed with the proof and, along the 
way, see a number of further subtleties that we have to address. 


The basic idea is to first fix $A_\mu$. We would then like to invoke an inequality along the lines of

$$|S(x,y)| \leq C\,e^{-m|x-y|} \tag{5.49}$$

where $m$ is the bare mass of the quark, $C$ is some constant, and $|S(x,y)|$ refers to the matrix norm with respect to spinor and flavour indices. Crucially, this inequality must hold for any background gauge field $A_\mu$, with the constants $C$ and $m$ independent of $A_\mu$. In other words, it should be a uniform bound.


Such a uniform bound survives when averaged over all gauge fields with a positive definite measure. This then gives us a bound on the correlation function that we’re interested in,

$$\langle J^a_\mu(x)\,J^b_\nu(y)\rangle \leq C^2\,e^{-2m|x-y|} \tag{5.50}$$




What is the interpretation of such a bound? Suppose that the lightest particle carrying the flavour quantum numbers of the current has mass $M$. Then, at large distances, we would expect the current-current correlation function to be dominated by exchange of this particle, meaning that

$$\langle J^a_\mu(x)\,J^b_\nu(y)\rangle \sim e^{-M|x-y|}$$

The bound above tells us that the physical mass of the particle is bounded from below

$$M \geq 2m \tag{5.51}$$

where $m$ is the bare mass of the quarks.

There are two immediate consequences of this result:


• First, it rules out the possibility of massless bosons. This is important because an equal bare mass for all the quarks breaks the axial symmetry, but leaves behind the vector $SU(N_f)_V$ flavour symmetry. The result (5.51) tells us that this vector flavour symmetry cannot be spontaneously broken, for if it were we would have massless Goldstone bosons with $M = 0$.


We learn that the vector flavour symmetry is not spontaneously broken when the quarks have a bare mass. But if it’s not spontaneously broken for any $m > 0$, then it can only become spontaneously broken in the limit $m \to 0$ if there is some miraculous accidental degeneracy, where a Lorentz invariant excited state decreases its energy, becoming exactly degenerate with the ground state at $m = 0$. This seems implausible. Under the assumption that no accidental degeneracy of this kind occurs, the Vafa-Witten theorem shows that vector-like symmetries are not spontaneously broken.


• Secondly, the Vafa-Witten theorem rules out the existence of massless fermions when the bare masses of the quarks are non-vanishing. This, of course, was what we wanted to prove.


Here, however, things are a little less straightforward and there is a subtlety that should be stressed. The calculation above holds in the presence of a finite UV cut-off, $\Lambda$. This was left implicit in the derivation, but the presence of the bare masses $m$ in the inequality (5.51) is the hint that there is an underlying cut-off in the game. The Vafa-Witten theorem tells us that, for $m \neq 0$, there can be no massless composite fermion carrying flavour quantum numbers for any finite $\Lambda$. However, it does not rule out the possibility that a massless fermion emerges as $\Lambda \to \infty$ and the cut-off is removed. Indeed, we know that it is only in this limit that anomalies kick in so, strictly speaking, ’t Hooft anomaly matching only requires the existence of massless fermions in the $\Lambda \to \infty$ limit. For QCD, it is not believed that such behaviour happens. But the Vafa-Witten theorem isn’t as watertight as we might hope in showing this.


A Proof of Theorem 2 


Let’s now prove the Vafa-Witten theorem (5.50). The trick is not to work with the propagator (5.48) between position eigenstates $|x\rangle$ and $|y\rangle$, but instead to work with a smeared propagator

$$S(\alpha,\beta) = \langle\alpha|\,\frac{1}{\not D + m}\,|\beta\rangle$$

where $|\alpha\rangle$ and $|\beta\rangle$ are wavepackets that have support only in localised regions, separated by a distance $R$ as shown below:

[Diagram: two localised regions, $\alpha$ and $\beta$, separated by a distance $R$.]

The “localised support” means that $\langle x|\alpha\rangle = 0$ for $x$ outside of the region $\alpha$, and similarly for $|\beta\rangle$. We’ll soon see the advantage of working with these smeared propagators.


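To proceed, we use a standard trick to rewrite the propagator as

$$S(\alpha,\beta) = \int_0^\infty dt\;\, \langle\alpha|\,e^{-(\not D + m)t}\,|\beta\rangle = \int_0^\infty dt\;\, e^{-mt}\,\langle\alpha|\,e^{-\not D t}\,|\beta\rangle$$

Here $t$ is just an artificial parameter that we’ve used to rewrite the integral. The next step is the clever one: we reinterpret $t$ as a genuine time direction for a theory in $d = 4+1$ dimensions with Hamiltonian $H = -i\not D$, so that $e^{-\not D t} = e^{-iHt}$. By causality, we know that a signal from region $\alpha$ takes at least time $t = R$ to reach the region $\beta$. This means that we must have

$$\langle\alpha|\,e^{-iHt}\,|\beta\rangle = 0 \quad\text{for } 0 \leq t < R$$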
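Furthermore, at later times we can simply use the Cauchy-Schwarz inequality to bound

$$|\langle\alpha|\,e^{-iHt}\,|\beta\rangle| \leq \sqrt{\langle\alpha|\alpha\rangle}\;\sqrt{\langle\beta|\,e^{+iHt}e^{-iHt}\,|\beta\rangle} = \sqrt{\langle\alpha|\alpha\rangle}\;\sqrt{\langle\beta|\beta\rangle}$$

where, in the final step, we have used the unitarity of $e^{-iHt}$. This then gives us the promised uniform bound on the propagator

$$|S(\alpha,\beta)| \leq |\alpha|\,|\beta|\int_R^\infty dt\;\, e^{-mt} = \frac{|\alpha|\,|\beta|}{m}\,e^{-mR} \tag{5.52}$$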

This is more or less the result that we wanted. It’s not quite the advertised bound on the propagator (5.49) because it uses wavepackets rather than position eigenstates. Nonetheless, it’s just as good for the purposes of proving what we want. The derivation also makes it clear why we needed smeared wavepackets: it’s because their norms $|\alpha|$ and $|\beta|$ appear explicitly in the bound. In contrast, position eigenstates aren’t normalisable and so don’t work for our purposes.


Theorem 3: The Pion is the Lightest Meson 


There are yet more applications of the positive definite measure. These are inequalities 
between the masses of various physical particles, first introduced by Weingarten. The 
first of these says that the pion is the lightest meson. 


We start by introducing the pseudoscalar meson field

$$\pi = \bar\psi_i\,\gamma^5\,\psi_j$$

where we have picked some $i \neq j$. In QCD, we would most naturally pick $i$ = up quark and $j$ = down quark, so that $\pi$ is identified with the genuine pion. Here, we’ll refer to $\pi$ as the pion for any $i$ and $j$. We give all quarks the same mass, $m > 0$, so that the $SU(N_f)_V$ vector symmetry is unbroken. The propagator of the pion is then

$$\langle\pi(x)\,\pi^\dagger(y)\rangle = \int d\mu\;\, {\rm tr}\big[S(x,y)\,\gamma^5\,S(y,x)\,\gamma^5\big]$$

where $d\mu$ is the usual positive-definite measure (5.45), $S(x,y)$ is the fermion propagator introduced in (5.48) and where the trace is over spinor and colour indices. Now, because $\{\gamma^5, \not D\} = 0$, we have


$$\gamma^5 S(y,x)\,\gamma^5 = \gamma^5\,\langle y|(\not D + m)^{-1}|x\rangle\,\gamma^5 = \langle y|(-\not D + m)^{-1}|x\rangle = \langle x|(\not D + m)^{-1}|y\rangle^\dagger \tag{5.53}$$

where the final equality follows because $\not D$ is anti-Hermitian. Note that the $|x\rangle$ and $|y\rangle$ labels got swapped as part of taking the Hermitian conjugate; the remaining $\dagger$ acts on colour and spinor indices. But this means that we have

$$\langle\pi(x)\,\pi^\dagger(y)\rangle = \int d\mu \sum_{\text{colour, spinor}} |S(x,y)|^2 \tag{5.54}$$

We learn that the pion propagator is positive definite. Note that the presence of the $\gamma^5$ matrix was crucial to make this claim, since it gave us the Hermitian conjugate in (5.53).

Let’s now contrast this with the propagator for a scalar meson that doesn’t include the $\gamma^5$. We have

$$\sigma = \bar\psi_i\,\psi_j$$

where we take $i \neq j$ to be the same indices as those carried by the pion. Repeating the arguments above, we now get

$$\langle\sigma(x)\,\sigma^\dagger(y)\rangle = \int d\mu\;\, {\rm tr}\big[S(x,y)\,S(y,x)\big] = \int d\mu\;\, {\rm tr}\big[S(x,y)\,\gamma^5\,S(x,y)^\dagger\,\gamma^5\big]$$

This means that we’re again summing over $|S(x,y)|^2$, but this time with different plus and minus signs for different spinor indices, coming from the presence of the $\gamma^5$ matrices in the final expression. We learn that we necessarily have

$$\langle\sigma(x)\,\sigma^\dagger(y)\rangle \leq \langle\pi(x)\,\pi^\dagger(y)\rangle$$
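
The sign structure behind this inequality is easy to see in components. The sketch below treats the propagator at fixed $x$, $y$ and fixed gauge field as an arbitrary complex $4\times 4$ matrix in spinor space (an assumption made purely for illustration, with colour indices suppressed) and takes $\gamma^5 = {\rm diag}(1,1,-1,-1)$ in a chiral basis.

```python
import random

GAMMA5 = (1, 1, -1, -1)  # diagonal entries of gamma^5 in a chiral basis

def pion_like(s):
    """tr[S S^dagger] = sum over all entries of |S_ij|^2."""
    return sum(abs(entry) ** 2 for row in s for entry in row)

def sigma_like(s):
    """tr[S gamma5 S^dagger gamma5] = sum over i, j of gamma5_ii * gamma5_jj * |S_ij|^2."""
    return sum(GAMMA5[i] * GAMMA5[j] * abs(s[i][j]) ** 2
               for i in range(4) for j in range(4))

rng = random.Random(2)
samples = [[[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)] for _ in range(4)]
           for _ in range(200)]
print(all(sigma_like(s) <= pion_like(s) for s in samples))
```

The scalar combination contains the same terms $|S_{ij}|^2$ as the pseudoscalar one, but with some of the signs flipped, so it can never be larger. Integrating against a positive measure preserves the inequality.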


But, at large distances, we expect each of these correlation functions to be dominated by the mass of the corresponding meson (or the mass of the lightest particle carrying the same quantum numbers). This means that the inequality above becomes, for large $|x-y|$,

$$e^{-m_\sigma|x-y|} \leq e^{-m_\pi|x-y|} \quad\Rightarrow\quad m_\sigma \geq m_\pi$$


This, of course, holds in our world because the pion is a Goldstone boson for broken chiral symmetry. Indeed, the mass inequality above can, like the Vafa-Witten theorem, be used to argue against the vector-like symmetry being broken, for then the $\sigma$ meson would be a massless Goldstone boson. The result above says that this can’t happen at finite $m$ where the pion is massive, and so the sigma meson must also be massive.


It is straightforward to repeat the arguments above with a different gamma matrix structure. For example, we could look at vector mesons of the form $\rho = \bar\psi\gamma^\mu\psi$ and show that these too are heavier than the pions.

Theorem 4: Baryons are Heavier than Pions 


The second Weingarten inequality bounds the mass of the baryon. For QCD, with three colours, the baryon takes the form

$$B = \epsilon^{abc}\,\psi^i_a\,\psi^j_b\,\psi^k_c$$

Here $i$, $j$ and $k$ are flavour indices and $a$, $b$ and $c$ are colour indices. The spinor indices are left implicit; they could be contracted to form a spin-$\frac{1}{2}$ baryon, or uncontracted for a spin-$\frac{3}{2}$ baryon. As we’ve seen, the two-point correlation function takes the form

$$\langle B(x)\,B^\dagger(y)\rangle \sim e^{-m_B|x-y|}$$
where $m_B$ is the mass of the lightest baryon sharing the quantum numbers of $B$. As previously, we take the bare masses of all quarks to be $m > 0$. We then have the expression

$$\langle B(x)\,B^\dagger(y)\rangle = \epsilon^{abc}\epsilon^{a'b'c'}\int d\mu\;\, {\rm tr}\big[S(x,y)_{aa'}\,S(x,y)_{bb'}\,S(x,y)_{cc'}\big]$$

where $d\mu$ is again the positive-definite measure (5.45) and this time we’ve kept the colour indices explicit. First, we use the Cauchy-Schwarz inequality to bound

$$\langle B(x)\,B^\dagger(y)\rangle \leq \int d\mu\,\Big(\sum_{a,a',\,\text{spinor}} |S(x,y)_{aa'}|^2\Big)^{3/2}$$


Suppose that we could argue that, for any choice of background gauge field,

$$|S(x,y)| \leq C'\,e^{-m|x-y|} \tag{5.55}$$

with $C'$ a constant, independent of the choice of gauge field, and $m$ the bare mass of the quark. In this case, we would immediately have

$$\langle B(x)\,B^\dagger(y)\rangle \leq C'\,e^{-m|x-y|}\int d\mu \sum_{\text{colour, spinor}} |S(x,y)|^2 = C'\,e^{-m|x-y|}\,\langle\pi(x)\,\pi^\dagger(y)\rangle$$

where, in the final equality, we’ve used our previous expression for the pion propagator (5.54). Now, recall from the proof of the Vafa-Witten theorem that we don’t quite have (5.55), but we have something almost as good: we need to replace the position eigenstates $|x\rangle$ and $|y\rangle$ with smeared wavepackets $|\alpha\rangle$ and $|\beta\rangle$ and we can then derive the uniform bound (5.52). This will do for our purposes; we therefore come to the conclusion that $m_B \geq m_\pi + m$. In the limit that the bare mass vanishes, so $m \to 0$, we learn that

$$m_B \geq m_\pi$$

Of course, this is hardly groundbreaking information given what we know about particle physics. But here it is derived from first principles, with no assumption of chiral symmetry breaking. Moreover, it tells us that if we wish chiral symmetry to be unbroken, with the ’t Hooft anomaly saturated by massless baryons, then the pion must also be massless. But this seems very unlikely, since if the pion is massless then there is nothing to stop it condensing and breaking the chiral symmetry after all.


5.6.4 Chiral Gauge Theories Revisited 


The existence of a global symmetry with a ’t Hooft anomaly guarantees the existence 
of massless particles in the spectrum. If the symmetry is spontaneously broken, we 
have Goldstone bosons. If the symmetry is unbroken, we have massless fermions whose 
presence is needed to reproduce the anomaly. 

So far, we have discussed situations in which ’t Hooft anomaly matching ensures the existence of massless bosons (together with the case of $N_f = 2$ where anomaly matching is ambivalent, but bosons arise anyway). Here we describe situations where massless fermions arise. Perhaps unsurprisingly, this typically happens in chiral gauge theories where tree-level fermion masses are prohibited by the gauge symmetry.


We will focus on one of the simplest chiral gauge theories,

$G = SU(5)$ with two Weyl spinors: $\psi_a$ in the $\bar{\mathbf{5}}$ and $\chi^{ab}$ in the $\mathbf{10}$

Here $a, b = 1,\ldots,5$ are the gauge group indices. The classical theory has two global symmetries: $U(1)_\psi$ and $U(1)_\chi$, each rotating the phase of a single fermion. One combination of these suffers a mixed anomaly with $SU(5)$. The surviving generator is

$$Q = 3Q_\psi - Q_\chi$$

This has a ’t Hooft anomaly

$$\mathcal{A} = \sum_{\rm fermions} Q^3 = 5\times 3^3 + 10\times(-1)^3 = 125$$

Let us now suppose that the theory confines, leaving the $U(1)_Q$ unbroken. The simplest colour singlet is the 3-fermion bound state

$$\psi_a\,\psi_b\,\chi^{ab} \tag{5.56}$$

This has charge $Q = 5$, giving an infra-red contribution to the ’t Hooft anomaly

$$\mathcal{A}_{\rm IR} = 5^3 = 125$$

We see that it is plausible that this fermion bound state does indeed remain massless.
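
The anomaly arithmetic for this $SU(5)$ theory is short enough to check by hand, but a small script makes the bookkeeping explicit. The sketch below assumes the charges $Q_\psi = 3$ and $Q_\chi = -1$ and Dynkin indices 1 and 3 for the $\bar{\mathbf{5}}$ and $\mathbf{10}$ (with the fundamental normalised to 1). The mixed gravitational sum is an additional consistency check that is not discussed in the text.

```python
# Each entry: (dimension of the SU(5) rep, Dynkin index with fundamental = 1, U(1)_Q charge).
UV_FERMIONS = {
    "psi (5bar)": (5, 1, 3),
    "chi (10)": (10, 3, -1),
}
IR_FERMIONS = {
    "psi psi chi (singlet)": (1, 0, 5),
}

def cubic_anomaly(fermions):
    """Sum of Q^3 over all Weyl fermion components."""
    return sum(dim * q**3 for dim, _, q in fermions.values())

def gravitational_anomaly(fermions):
    """Sum of Q over all components (the mixed U(1)-gravity anomaly)."""
    return sum(dim * q for dim, _, q in fermions.values())

def mixed_gauge_anomaly(fermions):
    """Sum of (Dynkin index) x Q: the U(1) x SU(5)^2 anomaly, which must vanish."""
    return sum(index * q for _, index, q in fermions.values())

print(mixed_gauge_anomaly(UV_FERMIONS))
print(cubic_anomaly(UV_FERMIONS), cubic_anomaly(IR_FERMIONS))
print(gravitational_anomaly(UV_FERMIONS), gravitational_anomaly(IR_FERMIONS))
```

The vanishing of the first sum is what singles out the combination $Q = 3Q_\psi - Q_\chi$ as the non-anomalous one.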


A Different Perspective 


We can reach the same conclusion through a rather different argument. Suppose that 
a fermion bi-linear forms a condensate. Since any such bilinear is necessarily charged 
under the gauge group, the condensate will partially Higgs the gauge symmetry. What 
symmetry breaking patterns occur? 


This is not completely straightforward. We can make a number of different fermion 
bilinears, each decomposing into some number of channels. Based on the computation 
of the classical force between quarks described in Section 2.5.1, some of these channels
will be attractive and some repulsive. It seems likely that the condensate forms in an 
attractive channel, but there are several of these. 

At this point, we need to use a little guesswork. The most naive approach is to determine which quark pair has the most attractive force and assume that the condensate forms in this channel. This is clearly optimistic (after all, we’re dealing with a strongly coupled theory and the classical force calculation is unlikely to provide quantitative guidance) but does give sensible answers in many cases. It is known as the maximally attractive channel criterion. More generally in these situations, one tries different possibilities and sees which outcomes seem the least baroque. Note that, in contrast to the QCD-like theories, we cannot turn to the lattice for help because there are various obstacles to discretising chiral fermions.


For the problem in hand, it is thought that the naive, most-attractive channel hypothesis does give rise to the correct physics. In fact, there are two channels which are equally attractive. These are:

$$\mathbf{5} \subset \bar{\mathbf{5}}\otimes\mathbf{10} \quad\text{and}\quad \bar{\mathbf{5}} \subset \mathbf{10}\otimes\mathbf{10}$$

We therefore postulate the existence of two quark condensates

$$\langle\psi_a\,\chi^{ab}\rangle = \sigma^b \quad\text{and}\quad \langle\chi^{ab}\,\chi^{cd}\rangle = \epsilon^{abcde}\,\Delta_e \tag{5.57}$$

These two condensates are not gauge invariant. Between them, they could break the $SU(5)$ gauge group to either $SU(4)$ (if they lie parallel to each other) or $SU(3)$. Again, we have to engage in a little guesswork. We will assume that they line up, with $\sigma^a = \sigma\,\delta^{a1}$ and $\Delta_a = \Delta\,\delta_{a1}$. The gauge group is then broken to

$$G = SU(5)_{\rm gauge} \longrightarrow SU(4)_{\rm gauge}$$

Naively, each of the condensates breaks the non-anomalous $U(1)$ global symmetry, with $Q(\sigma) = 2$ and $Q(\Delta) = -2$. However, as in the previous section, we can define a new, unbroken global symmetry by mixing the $U(1)$ with a suitable generator of the $SU(5)$ gauge symmetry,

$$Q' = Q - \frac{1}{2}\,{\rm diag}(4,-1,-1,-1,-1)$$

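At low-energies, the gauge and global symmetry groups are

$$G' = SU(4)_{\rm gauge} \times U(1)_{Q'}$$

Decomposing each fermion into representations of this new group, we have

$$\psi : \bar{\mathbf{5}}_3 \to \bar{\mathbf{4}}_{5/2} \oplus \mathbf{1}_5 \quad\text{and}\quad \chi : \mathbf{10}_{-1} \to \mathbf{6}_0 \oplus \mathbf{4}_{-5/2}$$

The $\langle\psi\chi\rangle$ condensate in (5.57) pairs the $\bar{\mathbf{4}}_{5/2}$ with the $\mathbf{4}_{-5/2}$ and gives them a mass, while the $\langle\chi\chi\rangle$ condensate gives a mass to the $\mathbf{6}_0$. This leaves us with the gauge singlet $\mathbf{1}_5$. This has the same quantum numbers as the massless composite fermion (5.56) that we anticipated by ’t Hooft anomaly matching.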
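
The charges in this decomposition can be reproduced component by component. The sketch below assumes the shifted generator takes the form $Q' = Q - \frac{1}{2}{\rm diag}(4,-1,-1,-1,-1)$, with the diagonal matrix acting on an upper (fundamental) gauge index, so that a lower index contributes with the opposite sign. This convention and the function names are assumptions made for illustration.

```python
from fractions import Fraction

# Assumed diagonal SU(5) generator, acting on an upper (fundamental) gauge index.
T = [Fraction(4), Fraction(-1), Fraction(-1), Fraction(-1), Fraction(-1)]
HALF = Fraction(1, 2)

def psi_charges(q=3):
    """Q' of psi_a: a lower index, so the gauge part enters with the opposite sign."""
    return [q + HALF * T[a] for a in range(5)]

def chi_charges(q=-1):
    """Q' of chi^{ab} for a < b: two upper indices."""
    return {(a, b): q - HALF * (T[a] + T[b]) for a in range(5) for b in range(a + 1, 5)}

def condensate_charges():
    """Q' of sigma^1 (upper index, Q = 2) and Delta_1 (lower index, Q = -2)."""
    return 2 - HALF * T[0], -2 + HALF * T[0]

print(psi_charges())
print(sorted(set(chi_charges().values())))
print(condensate_charges())
```

One component of $\psi$ carries charge 5 and four carry charge 5/2, while $\chi$ splits into six components of charge 0 and four of charge $-5/2$. Both condensates are neutral under $Q'$.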

Although we’ve had to engage in some guesses along the way, we end up with a plausible situation: the low energy dynamics of the chiral $SU(5)$ theory consists of a single, free Weyl fermion. This can either be viewed as a composite fermion (5.56) in a confining theory, or as a fundamental fermion in a theory with quark condensates (5.57): the end result is the same.


We could also ask if there are other possibilities which look equally plausible. For example, is it possible that the global $U(1)_Q$ is spontaneously broken, resulting in a massless boson instead of a massless fermion? For this to happen, we need to construct a bosonic, gauge invariant condensate. The simplest contains six fermions, $\psi_a\psi_b\chi^{ab}\,\psi_c\psi_d\chi^{cd}$, and it seems unlikely that such a condensate would form.


More Chiral Gauge Theories 


The $SU(5)$ gauge theory described above is not the only one which is thought to confine, giving massless composite fermions. Indeed, the same behaviour is thought to occur for the two classes of chiral gauge theories introduced in Section 3.4.2. The first of these is:

$G = SU(N)$ with one Weyl fermion in the anti-symmetric representation and $N - 4$ Weyl fermions in the anti-fundamental representation

We denote the $N - 4$ fermions in the anti-fundamental representation as $\psi$ and the fermion in the anti-symmetric as $\chi$. The theory has an $SU(N-4) \times U(1)$ global symmetry, where the $SU(N-4)$ factor rotates the $\psi$ fields, while the non-anomalous $U(1)$ charges are given by

$$Q_\psi = N - 2 \quad\text{and}\quad Q_\chi = 4 - N$$


Assuming that this theory confines, the question is: what becomes of this global symmetry? As we have seen, if it is to survive unscathed then there must be a massless, composite fermion that reproduces the ’t Hooft anomaly. A candidate is the collection of 3-fermion bound states that, schematically, take the form $\lambda = \psi\chi\psi$. Displaying all the indices, this is

$$(\lambda_\alpha)_{ij} = \psi_{\beta\,a\,i}\;\epsilon^{\beta\gamma}\,\chi^{ab}_{\gamma}\;\psi_{\alpha\,b\,j} \tag{5.58}$$

where $\alpha, \beta, \gamma = 1, 2$ are spinor indices, $i, j = 1,\ldots,N-4$ are $SU(N-4)$ flavour indices, and $a, b = 1,\ldots,N$ are $SU(N)$ gauge indices. If you track through all the symmetry properties, you’ll find that $\lambda_{ij}$ is symmetric in $ij$, so this spinor transforms in the symmetric representation of the $SU(N-4)$ global symmetry group. It also has charge $Q = N$. It is not hard to check that these massless fermions $\lambda$ do indeed saturate the ’t Hooft anomalies, and therefore provide a good candidate for the infra-red physics of this theory.

As with the $SU(5)$ model, there is also a complementary approach to deriving the same result in which one first assumes that fermi bilinears $\langle\chi\psi\rangle$ and $\langle\chi\chi\rangle$ condense, breaking the gauge group $SU(N) \to SU(4)$, with all fermions pairing up except for a lone $\lambda$ with the same quantum numbers that we saw above.


The second chiral gauge theory that we met earlier is similar, but has quarks in the symmetric rather than anti-symmetric representation

$G = SU(N)$ with one Weyl fermion in the symmetric representation and $N + 4$ Weyl fermions in the anti-fundamental representation

This time there is a global $SU(N+4)$ symmetry, together with a single non-anomalous $U(1)$ under which the anti-fundamental fermions $\psi$ and the symmetric fermion $\chi$ have charges

$$Q_\psi = N + 2 \quad\text{and}\quad Q_\chi = -(N + 4)$$

Once again, it seems plausible that the theory confines without breaking the $SU(N+4) \times U(1)$ global symmetry, with the ’t Hooft anomalies saturated by a fermion (5.58). Tracking through the symmetrisation, this time $\lambda$ sits in the anti-symmetric representation of the global symmetry group $SU(N+4)$, again with charge $Q = N$. A few short calculations show that the ’t Hooft anomalies do indeed match.
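
One of these short calculations, the $U(1)^3$ anomaly, can be automated for both families of theories. The sketch below assumes the charges quoted above and counts components by dimension: $N(N\mp 4)$ anti-fundamental components, $N(N\mp 1)/2$ components for the anti-symmetric or symmetric $\chi$, and a composite of charge $N$ in the symmetric or anti-symmetric representation of the flavour group. It does not check the anomalies involving the non-abelian flavour symmetry.

```python
def cubic_uv_antisymmetric(n):
    """U(1)^3 anomaly of SU(N) with an anti-symmetric chi and N - 4 anti-fundamentals psi."""
    q_psi, q_chi = n - 2, 4 - n
    return n * (n - 4) * q_psi**3 + (n * (n - 1) // 2) * q_chi**3

def cubic_ir_antisymmetric(n):
    """The same anomaly from composites of charge N in the symmetric of SU(N - 4)."""
    flavours = n - 4
    return (flavours * (flavours + 1) // 2) * n**3

def cubic_uv_symmetric(n):
    """U(1)^3 anomaly of SU(N) with a symmetric chi and N + 4 anti-fundamentals psi."""
    q_psi, q_chi = n + 2, -(n + 4)
    return n * (n + 4) * q_psi**3 + (n * (n + 1) // 2) * q_chi**3

def cubic_ir_symmetric(n):
    """The same anomaly from composites of charge N in the anti-symmetric of SU(N + 4)."""
    flavours = n + 4
    return (flavours * (flavours - 1) // 2) * n**3

for n in range(5, 12):
    assert cubic_uv_antisymmetric(n) == cubic_ir_antisymmetric(n)
for n in range(2, 12):
    assert cubic_uv_symmetric(n) == cubic_ir_symmetric(n)
print("U(1)^3 anomalies match for the values of N tested")
```

For $N = 5$ the first family reduces to the $SU(5)$ theory above, and both functions return 125.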


5.7 Further Reading 


Spontaneous symmetry breaking is a powerful and unifying idea, explaining disparate 
phenomena in both particle physics and condensed matter physics. It is responsible for 
the existence of phonons in a solid and, as we have seen, the existence of pions in the 
strong force. When implemented in gauge theories, it provides a unified explanation 
for superconductivity and the electroweak vacuum. 


Jeffrey Goldstone was the first to realise that a spontaneously broken global symmetry gives rise to a massless particle, what we now call the Goldstone boson. He made this conjecture, and provided examples, in a 1961 paper whose title, “Field theories with Superconductor Solutions”, reveals the early cross-fertilisation between condensed matter and particle physics [78]. The general proof of the theorem followed soon afterwards in a paper with Salam and Weinberg [79].


Goldstone’s theorem was initially viewed with some dismay in particle physics. The 
existence of strictly massless bosons was ruled out by experiment, suggesting that spon- 
taneous symmetry breaking had little role to play at the fundamental level. This, of 
course, was too hasty. Subsequent work by Higgs and others, exploring symmetry 
breaking in gauge theories, provided the underpinning for the Standard Model. 
Meanwhile, it was realised that an approximate global symmetry could be spontaneously 
broken, resulting in an approximate Goldstone boson. (The name pseudo-Goldstone 
boson was coined by Weinberg, apparently to Jeffrey’s annoyance.) 


The discovery of what we would now call chiral symmetry was actually made slightly 
before Goldstone’s insight. In 1960, Yoichiro Nambu explained that an exact axial- 
vector current in beta decay would imply the existence of a massless pion field [140]. 
Like many papers of the time, it avoids the language of field theory and instead focusses 
on the “current algebra”, in which one works with commutation relations between 
currents and their matrix elements. This somewhat masks the connection to spontaneous 
symmetry breaking, which is not emphasised in the paper. This was one of the (many!) 
contributions for which Nambu was awarded the 2008 Nobel prize. 


A more modern formulation of the chiral Lagrangian came only in the mid-1960s. 
Gell-Mann and Levy introduced the sigma model [72]. In fact, they introduced two 
versions: the first is what we might call a “linear sigma model” and includes the field 
$\sigma$, related to the pion fields by a constraint $\sigma^2 + \pi^2 = 1$. Embarrassed by the new 
field, which had not been observed in experiments, they subsequently integrated it out to 
derive the “non-linear sigma model”, now named after a particle that does not exist 
and does not appear anywhere in the theory. The group-theoretic formulation of the 
non-linear sigma model that we used here is due to Weinberg [206], and was extended 
to general groups in [22]. 


The idea that baryons could arise as solitons in the chiral Lagrangian was proposed 
by Tony Skyrme, in a remarkably prescient pair of papers written in 1960 and 1961 
[184, 185]. These papers were apparently written without any awareness of the 
work described above, and were essentially ignored for more than a decade while the 
story of chiral symmetry breaking unfolded. The papers came to prominence only in 
the 1980s when it was realised that they played an important role in the story. The 
term “skyrmion” was coined in a 1984 meeting in honour of Tony Skyrme. (In a cute 
twist, the second paper thanks “Mr A. J. Leggatt” for performing the calculations as 
an undergraduate student. This mis-spelled student went on to win the Nobel prize.) 


The WZW term was introduced by Witten in 1983 [226]. The arguments in Section 
5.5 are largely taken from this paper. (Many of Witten’s papers from this time are 
masterclasses in clarity; the best way to learn much of modern physics is simply to 
read Witten’s papers.) As we saw, for $N_f = 2$ there is no WZW term, but the fact 
that topology can determine the quantum statistics of the Skyrmion was noted by 
Finkelstein and Rubinstein, back in 1968 [60]. 


More on the history of chiral symmetry breaking can be found in the article by 
Weinberg [209]. More details about the physics of chiral symmetry breaking can be 
found in the lecture notes of Scherer and Schindler [173] and Peskin [153]. 


The idea that anomalies place severe constraints on the spectrum of strongly inter- 
acting gauge theories was first emphasised by ’t Hooft in the lectures [105], with the 
application to chiral symmetry breaking that we described in these lectures. This was 
elaborated on by Frishman, Schwimmer, Banks and Yankielowicz [64]. The “persistent 
mass condition”, prohibiting the formation of massless bound states using massive 
constituents, was framed by Preskill and Weinberg [163] and found a more rigorous 
grounding in the Vafa-Witten theorems [195, 196]. The mass inequalities, which also 
make use of the positive definite measure, were first introduced by Weingarten (very) 
slightly before the Vafa-Witten theorem [210]. The idea that the Higgs and confining 
phases provide complementary, but equivalent, viewpoints on the dynamics of chiral 
gauge theories was first enunciated in [189]. 


6. Large N 


Non-Abelian gauge theories are hard. We may have mentioned this previously. Indeed, 
it’s not a bad summary of the lectures so far. The difficulty stems from the lack of 
a small, dimensionless parameter which we can use as the basis for a perturbative 
expansion. 


Soon after the advent of QCD, ’t Hooft pointed out that gauge theories based on the 
group G = SU(N) simplify in the limit $N \to \infty$. This can then be used as a starting 
point for an expansion in 1/N. Viewed in the right way, Yang-Mills does have a small 
parameter after all. 


At first glance, it seems surprising that the theory simplifies in the large N limit. 
Naively, you might think that the theory only gets more complicated as the number 
of fields increases. However, this intuition breaks down when the fields are related by a 
symmetry, in which case the collective behaviour of the fields becomes stiffer as their 
number increases. This results in a novel, classical regime of the theory. The weakly 
coupled degrees of freedom typically look very different from the gluons that we start 
with in the original Lagrangian. 


Large N limits are now commonplace in statistical and quantum physics. As a 
general rule of thumb, the large N limit renders a theory tractable when the number 
of degrees of freedom grows linearly with N. (We shall meet two examples in Section 7 
when we discuss the $\mathbb{CP}^{N-1}$ model and the Gross-Neveu model.) In contrast, when the 
number of degrees of freedom grows as $N^2$, or faster, then the theory simplifies but, 
apart from a few special cases, cannot be solved. This is the case for Yang-Mills where 
the large N limit will not allow us to demonstrate, say, confinement. Nonetheless, it 
does provide an approach which allows us to compute certain properties. Moreover, it 
points to a deep connection between gauge theory and string theory, one which underlies 
many of the recent advances in both subjects. 


You might reasonably wonder whether the large N expansion is likely to be relevant 
for QCD which has N = 3. We’ll see as we go along how useful it is. A common 
rebuttal, originally due to Witten, is that in natural units the fine structure constant 
is 


$$\alpha = \frac{e^2}{4\pi} \approx \frac{1}{137} \quad\Rightarrow\quad e \approx 0.30$$


This comparison is a little unfair. The true expansion parameter in QED is better 
phrased as $\alpha/4\pi \sim 10^{-3}$. In contrast, there are no factors of $4\pi$ that ride to the rescue 
for Yang-Mills. The expansion parameter is $1/N$ or, in many situations, $1/N^2$. We 
might therefore hope that this approach will give us results that are quantitatively 
correct at the 10% level. 


6.1 A Quantum Mechanics Warm-Up: The Hydrogen Atom 


We start by providing a simple example where a large N limit offers a novel way to 
apply perturbation theory. The set-up is very familiar: the hydrogen atom. 


In natural units, $\hbar = c = \epsilon_0 = 1$, the Hamiltonian of the hydrogen atom is 


$$H = -\frac{1}{2m}\nabla^2 - \frac{\alpha}{r} \qquad (6.1)$$


with $\alpha$ the fine structure constant. In our first course on Quantum Mechanics, we learn 
the exact solution for the bound states of this system. But suppose we didn’t know 
this. Can we try to approximate the solutions using perturbation theory? 


Since there’s a small number, $\alpha \approx 1/137$, sitting in the potential term, you might 
think that you could expand in $\alpha$. But this is misleading. In the context of atomic 
physics, the fine structure constant cannot be used as the basis for a perturbative 
expansion. This is because we can always reabsorb it by a change of scale. Define 
$r' = m\alpha r$. Then the Hamiltonian becomes 


$$H = m\alpha^2\left(-\frac{1}{2}\nabla'^2 - \frac{1}{r'}\right)$$


We see that the fine structure constant simply sets the overall scale of the problem. 
This means that we expect the order of magnitude of the bound state energies to be around 


$$E_{\rm atomic} = -m\alpha^2 \approx -27.2\ {\rm eV}$$


In fact, the ground state energy is $E_{\rm atomic}/2 \approx -13.6$ eV, the factor of 1/2 coming from 
solving the Schrödinger equation. 
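

As a quick sanity check on these numbers, the following Python sketch evaluates $E_{\rm atomic} = -m\alpha^2$. The values used for the electron rest energy and for $\alpha$ are standard reference values rather than anything quoted in these lectures, and the function name is purely illustrative.

```python
# Standard reference values (assumptions, not taken from the lectures).
ELECTRON_REST_ENERGY_EV = 510_998.95   # m c^2 for the electron, in eV
FINE_STRUCTURE = 1 / 137.035999        # fine structure constant alpha


def atomic_energy_scale(mass_ev=ELECTRON_REST_ENERGY_EV, alpha=FINE_STRUCTURE):
    """Return E_atomic = -m alpha^2 in eV, working in natural units."""
    return -mass_ev * alpha**2


e_atomic = atomic_energy_scale()
print(f"E_atomic = {e_atomic:.2f} eV")       # roughly -27.2 eV
print(f"E_ground = {e_atomic / 2:.2f} eV")   # roughly -13.6 eV
```

Nothing here depends on $\alpha$ being small: changing its value simply rescales every energy by the same factor.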


For our purposes this means that the hydrogen atom is, like Yang-Mills, a theory 
with a scale but with no small, dimensionless parameter. How, then, to construct a 
perturbative solution? One possibility is to generalise the problem from three 
dimensions to N dimensions. The Hamiltonian remains (6.1), but now with $\nabla^2$ denoting the 
Laplacian in $\mathbb{R}^N$ rather than $\mathbb{R}^3$. Clearly we have increased the number of degrees 
of freedom from 3 to N. We have also increased the symmetry group from SO(3) to 
SO(N). 


We note in passing that we are not solving the higher dimensional version of the 
hydrogen atom, since in that case the Coulomb potential would fall off as $1/r^{N-2}$. Instead, 
we keep the Coulomb potential fixed as $1/r$ and vary the dimension of space. 


To see how this helps, we will focus on the s-wave sector. Here the Schrödinger 
equation becomes 


$$H\psi = m\alpha^2\left(-\frac{1}{2}\frac{d^2}{dr'^2} - \frac{(N-1)}{2r'}\frac{d}{dr'} - \frac{1}{r'}\right)\psi = E\psi$$


At leading order in 1/N, we can replace the (N − 1) factor by N. We’ll do this 
because the equations are a little simpler, although if we were serious about pursuing 
perturbation theory in 1/N, we would have to be more careful. We can now remove 
the term that is first order in derivatives by redefining the wavefunction as 
$\psi(r') = \chi(r')/r'^{N/2}$, leaving us with the rescaled Schrödinger equation 


$$H\chi = m\alpha^2\left(-\frac{1}{2}\frac{d^2}{dr'^2} + \frac{N^2}{8r'^2} - \frac{1}{r'}\right)\chi = E\chi$$

We’ll make one further rescaling, and define a new radial coordinate, $r' = N^2 R$. The 
Schrödinger equation now becomes 


$$H\chi = \frac{m\alpha^2}{N^2}\left(-\frac{1}{2N^2}\frac{d^2}{dR^2} + V_{\rm eff}(R)\right)\chi = E\chi \quad\text{with}\quad V_{\rm eff}(R) = \frac{1}{8R^2} - \frac{1}{R}$$


This rescaling has removed all N dependence from the effective potential. Instead, we 
see that it appears in two places: the overall scale of the problem; and the effective 
(dimensionless) mass of the particle, which can be read off from the kinetic term and 
is $m_{\rm eff} = N^2$. 


We’re left with a very heavy particle, moving in the one-dimensional effective 
potential $V_{\rm eff}(R)$. In this limit, we can expand the potential in a Taylor series around the 
minimum $R_{\rm min} = 1/4$. To leading order, we can then treat the problem as a harmonic 
oscillator, centred on $R_{\rm min}$. Higher order terms in the Taylor series will affect the energy 
only at subleading order in 1/N. 


To leading order, the ground state energy is given by $V_{\rm eff}(R_{\rm min})$. (The zero point 
energy of the harmonic oscillator is suppressed by $1/\sqrt{m_{\rm eff}} \sim 1/N$.) This gives us our 
expression for the ground state energy of the hydrogen atom, 


$$E_{\rm ground} = -\frac{m\alpha^2}{N^2}\left(2 + \mathcal{O}\!\left(\frac{1}{N}\right)\right)$$


If we now revert to the real world with N = 3, we get $E_{\rm ground} \approx -2m\alpha^2/9$. The true 
answer, as we mentioned above, is $E_{\rm ground} = -m\alpha^2/2$. 
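

The steps above are easy to check numerically. The sketch below, in Python, locates the minimum of $V_{\rm eff}(R)$ with a simple golden-section search and compares the leading large N estimate with the exact s-wave answer. The exact formula $E = -2m\alpha^2/(N-1)^2$ is the standard result for a 1/r potential in N dimensions; it is quoted here as an assumption rather than derived in the text. All function names are illustrative, and energies are measured in units of $m\alpha^2$.

```python
def v_eff(big_r):
    """Effective potential V_eff(R) = 1/(8 R^2) - 1/R of the rescaled problem."""
    return 1.0 / (8.0 * big_r**2) - 1.0 / big_r


def minimise(func, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (5**0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) < func(d):
            b = d   # the minimum lies in [a, d]
        else:
            a = c   # the minimum lies in [c, b]
    return (a + b) / 2


def leading_large_n_energy(n_dim):
    """Leading-order estimate E = V_eff(R_min) / N^2, in units of m alpha^2."""
    r_min = minimise(v_eff, 0.05, 2.0)
    return v_eff(r_min) / n_dim**2


def exact_energy(n_dim):
    """Assumed exact s-wave ground state energy, -2 / (N - 1)^2, in units of m alpha^2."""
    return -2.0 / (n_dim - 1) ** 2


for n_dim in (3, 10, 100, 1000):
    estimate = leading_large_n_energy(n_dim)
    exact = exact_energy(n_dim)
    print(f"N = {n_dim:5d}  estimate = {estimate:.3e}  exact = {exact:.3e}  "
          f"ratio = {estimate / exact:.3f}")
```

For N = 3 the leading estimate is off by more than a factor of two, but the ratio approaches one as N grows, which is all that the leading term of a 1/N expansion promises.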


Of course, it’s a little perverse to apply perturbation theory to a problem for which there 
is an exact solution. But the key idea remains: the extra degrees of freedom, together 
with the restriction of O(N) symmetry, combine to render the problem weakly coupled 
in the limit $N \to \infty$. We will now see how a similar effect occurs for Yang-Mills theory. 


6.2 Large N Yang-Mills 


The action for SU(N) Yang-Mills theory is 


$$S_{\rm YM} = -\frac{1}{2g^2}\int d^4x\ {\rm tr}\, F^{\mu\nu}F_{\mu\nu}$$


There is an immediate hurdle if we try to naively take the large N limit. As we saw in 
Section 2.4, confinement and the mass gap both occur at the strong coupling scale $\Lambda_{\rm QCD}$ 
which, at one-loop, is given by 


$$\Lambda_{\rm QCD} = \Lambda_{\rm UV}\exp\left(-\frac{3}{22}\frac{(4\pi)^2}{g^2 N}\right)$$


If we keep both the UV cut-off $\Lambda_{\rm UV}$ and the gauge coupling $g^2$ fixed, and send $N \to \infty$, 
then there is no parametric separation between the physical scale $\Lambda_{\rm QCD}$ and the cut-off. 
This is bad. To rectify this, we define the ’t Hooft coupling, 


$$\lambda = g^2 N$$


We will consider the theory in the limit $N \to \infty$, with both $\Lambda_{\rm UV}$ and $\lambda$ held fixed. 
This ensures that the physical scale $\Lambda_{\rm QCD}$ also remains fixed in this limit. Indeed, 
throughout this section we will discuss how masses, lifetimes and scattering amplitudes 
of various states scale with N. In all cases, it is $\Lambda_{\rm QCD}$ which fixes the dimensions of 
these properties. 
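

To see the difference between the two ways of taking the limit, here is a small Python sketch of the one-loop formula for the strong coupling scale. The numerical values of the couplings and the function names are arbitrary choices made for illustration; the cut-off is set to one so that the output is the ratio of the physical scale to the cut-off.

```python
import math


def lambda_qcd(lambda_uv, g_squared, n_colours):
    """One-loop scale: Lambda_UV * exp(-(3/22) (4 pi)^2 / (g^2 N))."""
    exponent = -(3.0 / 22.0) * (4.0 * math.pi) ** 2 / (g_squared * n_colours)
    return lambda_uv * math.exp(exponent)


def lambda_qcd_thooft(lambda_uv, thooft_coupling):
    """The same scale written in terms of the 't Hooft coupling g^2 N."""
    return lambda_uv * math.exp(-(3.0 / 22.0) * (4.0 * math.pi) ** 2 / thooft_coupling)


for n_colours in (3, 30, 300, 3000):
    fixed_g = lambda_qcd(1.0, 1.0, n_colours)                   # g^2 = 1 held fixed
    fixed_thooft = lambda_qcd(1.0, 3.0 / n_colours, n_colours)  # g^2 N = 3 held fixed
    print(f"N = {n_colours:5d}  fixed g^2: {fixed_g:.4f}  fixed g^2 N: {fixed_thooft:.6f}")
```

With $g^2$ held fixed the ratio creeps up towards one, so the physical scale collides with the cut-off. With $g^2 N$ held fixed it does not move at all.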


With these new couplings, the Yang-Mills action is 


$$S_{\rm YM} = -\frac{N}{2\lambda}\int d^4x\ {\rm tr}\, F^{\mu\nu}F_{\mu\nu} \qquad (6.2)$$


This is the form we will work with. 


6.2.1 The Topology of Feynman Diagrams 


To proceed, we’re going to look more closely at the Feynman diagrams that arise from 
the Yang-Mills action (6.2). We’ll see that, in the ’t Hooft limit $N \to \infty$, $\lambda$ fixed, there 
is a rearrangement in the importance of various diagrams. 


We will write down the Feynman rules for Yang-Mills. Each gluon field is an N x N 
matrix, 


$$(A_\mu)^i{}_j \qquad i, j = 1, \ldots, N$$


The propagator has the index structure 


$$\langle A^i_{\mu\, j}(x)\, A^k_{\nu\, l}(y)\rangle = \Delta_{\mu\nu}(x - y)\left(\delta^i_l\,\delta^k_j - \frac{1}{N}\,\delta^i_j\,\delta^k_l\right)$$


where $\Delta_{\mu\nu}(x)$ is the usual photon propagator for a single gauge field. The 1/N term 
arises because we’re working with traceless SU(N) gauge fields, rather than U(N) 
gauge fields. But clearly it is suppressed by 1/N and so, at leading order in 1/N, we 
don’t lose anything by dropping this term. We then have 


$$\langle A^i_{\mu\, j}(x)\, A^k_{\nu\, l}(y)\rangle = \Delta_{\mu\nu}(x - y)\,\delta^i_l\,\delta^k_j$$


This means that we’re really working with U(N) gauge theory rather than SU(N) 
gauge theory. 


At this point, it is useful to introduce some new notation. The fact that the gauge 
field has two indices, $i, j$, suggests that we can represent it as two lines in a Feynman 
diagram rather than one. One of these lines represents the top index, which 
transforms in the $\mathbf{N}$ representation; the other the bottom index which transforms in the $\bar{\mathbf{N}}$ 
representation. Instead of the usual curly line notation for the gluon propagator, we 
have 


[Figure: the gluon propagator, usually a curly line, drawn as a double line with opposing arrows; it scales as $\lambda/N$]   (6.3) 


Note that each line comes with an arrow, and the arrows point in opposite ways. This 
reflects the fact that the upper and lower lines are associated to complex conjugate 
representations. The propagator scales as $\lambda/N$, as can be read off from the action 
(6.2). 


Similarly, the cubic vertex that comes from expanding out the Yang-Mills action takes 
the form 


[Figure: the cubic vertex in double line notation, with index lines labelled i, j, k; it scales as $N/\lambda$] 


where we’ve now included the $i, j, k = 1, \ldots, N$ indices to show how these must match 
up as we follow the arrows. (There is also a second diagram from the cubic vertex in 
which the arrows are reversed.) Similarly, the quartic coupling vertex becomes 


[Figure: the quartic vertex in double line notation; it scales as $N/\lambda$] 


Each vertex comes with a factor of $N/\lambda$. This also follows from the action (6.2). The 
fact that the vertex comes with an inverse power of the coupling might be unfamiliar, 
but it is because of the way we chose to scale our fields. It will all come out in the 
wash, with the propagators compensating so that increasingly complicated diagrams 
are suppressed by powers of $\lambda$ as expected. We’ll see examples shortly. 


As we evaluate the various Feynman diagrams, we will now have a double expansion 
in both $\lambda$ and in 1/N. We’d like to understand how the diagrams arrange themselves. 
The general scaling will be 


$$\text{diagram} \sim \left(\frac{\lambda}{N}\right)^{\#\text{propagators}} \left(\frac{N}{\lambda}\right)^{\#\text{vertices}} N^{\#\text{index contractions}} \qquad (6.4)$$


where the index contractions come from the loops in the diagram. To see this more 
clearly, it’s best to look at some examples. 


Vacuum Bubbles 


To understand the Feynman diagram expansion, let’s start by considering the vacuum 
bubbles. The leading order contribution is a diagram which, in double line notation, 
looks like, 


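[Figure: the simplest planar vacuum bubble in double line notation, with three index loops] 

$$\sim \left(\frac{\lambda}{N}\right)^3 \left(\frac{N}{\lambda}\right)^2 N^3 \sim \lambda N^2 \qquad (6.5)$$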


Here the first two factors come from the 3 propagators and the 2 vertices in the diagram. 
The final factor is important: it comes from the fact that we have three contractions 
over the indices i,j,k = 1,...,N. These are denoted by the three arrows in the 
diagram. Note that we get a contribution from the outside circle since we’re dealing 
with vacuum bubbles. 


Similarly, at the next order in $\lambda$, we have the diagram 


[Figure: a planar vacuum bubble at the next order, in double line notation, with four index loops] 

$$\sim \left(\frac{\lambda}{N}\right)^6 \left(\frac{N}{\lambda}\right)^4 N^4 \sim \lambda^2 N^2 \qquad (6.6)$$


There are now four contractions over internal loops. This diagram has the same $N^2$ 
behaviour as our first diagram, but it is down in the expansion in ’t Hooft 
coupling. It is easy to convince yourself that the two diagrams above give the leading 
contribution (in N) to the free energy, which scales as $\mathcal{O}(N^2)$. This reflects the fact 
that Yang-Mills theory has $N^2$ degrees of freedom. 


However, there is another diagram that we could have drawn. This has the same 
momentum structure as (6.5), but a different index structure. In double line notation 
it takes the form, 


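[Figure: the non-planar vacuum bubble in double line notation, with a single index loop] 

$$\sim \left(\frac{\lambda}{N}\right)^3 \left(\frac{N}{\lambda}\right)^2 N \sim \lambda \qquad (6.7)$$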


If you follow the loop around, you will find that there is now just a single contraction 
of the group indices. The result is a contribution to the vacuum energy which occurs 
at the same value of $\lambda$ as (6.5), but is suppressed by $1/N^2$ relative to the first two 
diagrams. This means that in the limit $N \to \infty$, with $\lambda$ fixed, this diagram will be 
sub-dominant. 


We see that, among all the possible Feynman diagrams, a subset dominate in the 
large N limit. The dominant diagrams are those which, like (6.5) and (6.6), can be 
drawn flat on a plane in the double line notation. These are referred to as planar 
diagrams. In contrast, diagrams like (6.7) need a third dimension to draw them. These 
non-planar diagrams are subleading. 


The large N limit has seemed to simplify our task. We no longer need to sum over 
all Feynman diagrams; only the planar ones. This remains daunting. Nonetheless, as 
we will see below, this new structure does give us some insight into the strong coupling 
dynamics of non-Abelian gauge theory. 


The Gluon Propagator 


The ideas above don’t just apply to the vacuum bubbles. A similar distinction holds 
for any Feynman diagram. We can, for example, consider the gluon propagator (6.3). 
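A planar, one-loop correction is given by the diagram 


[Figure: a planar one-loop correction to the gluon propagator in double line notation; it scales as $\lambda^2/N$] 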


Now we sum only over the indices on the internal loop, because we have fixed the 
external legs. We see that this again gives a contribution with the same 1/N scaling 
as the original propagator (6.3), but is down by a power of the ’t Hooft coupling. 


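Meanwhile, the following two-loop, non-planar graph scales as 


[Figure: a two-loop non-planar correction to the gluon propagator in double line notation, with no closed index loops] 

$$\sim \left(\frac{\lambda}{N}\right)^7 \left(\frac{N}{\lambda}\right)^4 \sim \frac{\lambda^3}{N^3}$$


and is suppressed by $1/N^2$ compared to the earlier contributions. 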


The Topology of Feynman Diagrams 


Let’s understand better how to order the different diagrams. We’ll return to the vacuum 
diagrams. The key idea is that each of these can be inscribed on the surface of a two 
dimensional manifold of a given topology. 


The planar diagrams can all be drawn on the surface of a sphere. This 
is because for any graph on a sphere, you can remove one of the faces and 
flatten out what’s left to give the planar graph. The simplest example is 
the vacuum diagram (6.5) which sits nicely on the sphere as shown in Figure 48. 


[Figure 48: the vacuum diagram (6.5) inscribed on a sphere] 


In contrast, the non-planar diagrams must be drawn on higher genus surfaces. For 
example, the non-planar vacuum diagram (6.7) cannot be inscribed on a sphere, but 
requires a torus. It also requires more artistic skill than I can muster, but looks something 
like 


[Figure: the non-planar vacuum diagram (6.7) drawn on a torus] 


Figure 49: Examples of the simplest Riemann surfaces with $\chi = 2$, $0$ and $-2$. 


In general, the Feynman diagram tiles a two dimensional surface $\Sigma$. The map is 

  E = # of edges = # of propagators 
  F = # of faces = # of index loops 
  V = # of vertices 

From (6.4), a given diagram then scales as 


$$\text{diagram} \sim N^{F + V - E}\,\lambda^{E - V}$$


But there is a beautiful fact, due to Euler, which says that the following combination 
determines the topology of the Riemann surface 


$$\chi(\Sigma) = F + V - E \qquad (6.8)$$


The quantity $\chi(\Sigma)$ is called the Euler character. It is related to the number of handles 
H of the Riemann surface, also called the genus, by 


$$\chi(\Sigma) = 2 - 2H \qquad (6.9)$$


The simplest examples are shown in the figure. The sphere has H = 0 and $\chi = 2$; the 
torus has H = 1 and $\chi = 0$; the thing with two holes in has H = 2 and $\chi = -2$. In 
this way, the large N expansion is a sum over Feynman diagrams, weighted by their 
topology 


$$\text{diagram} \sim N^{\chi}\,\lambda^{E - V}$$


For each genus, the Riemann surface can be tiled in different ways by Feynman diagram 
webs, giving the expansion in the ’t Hooft coupling. There is no topological 
interpretation of this exponent $E - V$. We’ll shortly discuss the implication of this large N 
expansion. 
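

The counting above can be packaged into a few lines of Python. The sketch below takes the number of edges, vertices and faces of a vacuum diagram and returns the powers of N and of the ’t Hooft coupling, together with the genus of the surface on which the diagram can be drawn. It assumes a connected vacuum diagram on a closed, orientable surface; the function name is illustrative. The three examples are the vacuum diagrams discussed earlier: two planar diagrams, and one non-planar diagram with a single index loop.

```python
def diagram_scaling(edges, vertices, faces):
    """Return (power of N, power of the 't Hooft coupling, genus) for a vacuum diagram.

    Uses diagram ~ N^(F + V - E) * coupling^(E - V), with F + V - E = 2 - 2H.
    """
    euler_character = faces + vertices - edges
    coupling_power = edges - vertices
    if (2 - euler_character) % 2 != 0 or euler_character > 2:
        raise ValueError("not a diagram on a closed, orientable surface")
    genus = (2 - euler_character) // 2
    return euler_character, coupling_power, genus


examples = {
    "planar, 3 propagators, 2 vertices, 3 index loops": (3, 2, 3),
    "planar, 6 propagators, 4 vertices, 4 index loops": (6, 4, 4),
    "non-planar, 3 propagators, 2 vertices, 1 index loop": (3, 2, 1),
}
for name, (edges, vertices, faces) in examples.items():
    n_power, coupling_power, genus = diagram_scaling(edges, vertices, faces)
    print(f"{name}: N^{n_power} * coupling^{coupling_power}, genus {genus}")
```

The two planar diagrams come out at order $N^2$ with genus zero, while the non-planar one is of order $N^0$ and sits on a torus.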


The Euler Character 


Before we proceed, it will be useful to get some intuition for why the Euler character 
(6.8) is a topological invariant, and why it is given by (6.9). 


To see the former, it’s best to play around a little bit by deforming various diagrams. 
The key manipulation is to take a face and shrink it to vanishing size. For example, 
we have 


[Figure: a triangular face of a diagram shrinking to a single vertex] 


Under such a transformation, the number of faces shrinks by 1: $F \to F - 1$. The number 
of vertices has also decreased, $V \to V - 2$, as has the number of edges, $E \to E - 3$. 
But the combination $\chi = F + V - E$ remains unchanged. 


In all the examples above, we used only the cubic Yang-Mills vertex. Including the 
quartic vertex doesn’t change the counting. This is because we can always split the 
quartic vertex into two cubic ones, 


[Figure: a quartic vertex split into two cubic vertices joined by a propagator] 


The left hand side has V = 1 and E = 4, which transforms into the right hand side 
with V = 2 and E = 5. We see that neither $\chi$ nor the power of $\lambda$ depends on the kind of 
vertex that we use. 


This should help explain why the Euler character does not vary under manipulations 
that make the diagram more and more complicated, but leave the underlying topology 
unchanged. For the sphere, the example we drew above shows that $\chi = 2$. For each 
extra handle, we can first consider cutting a hole in the surface. We do this 
by removing a face, leaving us with a boundary. To build a handle, we cut out two 
faces, each of which is an n-gon. This reduces the number of faces $F \to F - 2$. Now 
we glue the faces together by identifying the perimeters of the holes. This act reduces 
$E \to E - n$ and $V \to V - n$. But the net effect is that for each handle we add, 
$\chi \to \chi - 2$. 
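

The three manipulations described above are pure bookkeeping, and can be checked in a few lines of Python. The sketch below tracks only the counts (F, V, E) and not the diagram itself. The starting tilings, a tetrahedron and a cube, are illustrative choices that are not taken from the text, and the function names are hypothetical.

```python
def euler_character(faces, vertices, edges):
    """chi = F + V - E."""
    return faces + vertices - edges


def shrink_triangle(faces, vertices, edges):
    """Shrink a triangular face to a point: F -> F - 1, V -> V - 2, E -> E - 3."""
    return faces - 1, vertices - 2, edges - 3


def split_quartic_vertex(faces, vertices, edges):
    """Replace one quartic vertex by two cubic ones: V -> V + 1, E -> E + 1."""
    return faces, vertices + 1, edges + 1


def add_handle(faces, vertices, edges, n_sides):
    """Cut out two n-gon faces and glue their perimeters: F -> F - 2, E -> E - n, V -> V - n."""
    return faces - 2, vertices - n_sides, edges - n_sides


tetrahedron = (4, 4, 6)   # (F, V, E): a tiling of the sphere with cubic vertices
cube = (6, 8, 12)         # (F, V, E): another tiling of the sphere with cubic vertices

print(euler_character(*tetrahedron))                         # 2
print(euler_character(*shrink_triangle(*tetrahedron)))       # 2
print(euler_character(*split_quartic_vertex(*tetrahedron)))  # 2
print(euler_character(*add_handle(*cube, n_sides=4)))        # 0
```

Shrinking one face of the tetrahedron leaves F = 3, V = 2 and E = 3, which are the counts of the simplest planar vacuum bubble. Gluing two opposite faces of the cube gives a tiling of the torus.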


6.2.2 A Stringy Expansion of Yang-Mills 


The large N limit of Yang-Mills has been repackaged as a sum over Riemann surfaces 
of different topologies. But this is the defining feature of weakly coupled string theory. 
This is discussed in much detail in the lectures on String Theory; here we’ll just mention 
some pertinent facts. 


In string theory, the sum over Riemann surfaces is weighted by the string coupling 
constant $g_s$. By analogy, we see that 


$$g_s = \frac{1}{N}$$


But there are also differences. In string theory, the Riemann surfaces are smooth 
objects, which suffer quantum fluctuations governed by the inverse string tension $\alpha'$. 
This is a quantity with dimension $[\alpha'] = -2$ and it is often written as $\alpha' = l_s^2$ with $l_s$ 
the typical size of a string. The fluctuations of the Riemann surface are really governed 
by $\alpha'/L^2$ where L is the spatial size of the background in which the string propagates. 


In contrast, the Riemann surfaces that arise in the large N expansion are not smooth 
at all; they are tiled by Feynman diagrams and in the perturbative limit, $\lambda \ll 1$, 
the diagrams with the fewest vertices dominate. However, taken naively, it appears 
that in the opposite limit $\lambda \gg 1$, the diagrams with large numbers of vertices are 
important. With some imagination, these can be viewed as diagrams which finely 
cover the Riemann surface, so that it looks more and more like a classical geometry. 
This suggests that, in the ’t Hooft limit, strongly coupled Yang-Mills may be a weakly 
coupled string theory in some background, with 


$$\frac{\alpha'}{L^2} \sim \lambda^{-\#}$$


where I’ve admitted ignorance about the positive exponent #. 


This is a bold idea. Weakly coupled string theory is a theory of quantum gravity, 
and gives rise to general relativity at long distances. If we can somehow make the idea 
above fly, then Yang-Mills theory would contain general relativity! But the strings and 
gravity would not live in the d = 3+ 1 dimensions of the Yang-Mills theory. Instead, 
we would find gravity in the “space in which the Feynman diagrams live”, whatever 
that means. 


So far, no one has made sense of these ideas for pure Yang-Mills. However, it is 
now understood how these ideas fit together in a very closely related theory called 
maximally supersymmetric (or $\mathcal{N} = 4$) Yang-Mills which is just SU(N) Yang-Mills 
coupled to a bunch of adjoint scalars and fermions. In that case, the strongly coupled 
’t Hooft limit is indeed a theory of gravity in a d = 9 + 1 dimensional spacetime that 
has the form $AdS_5 \times S^5$. The d = 3 + 1 dimensional world in which the Yang-Mills 
theory lives is the boundary of $AdS_5$. This remarkable connection goes by the name of 
the AdS/CFT correspondence or, more generally, gauge-gravity duality. It is a topic 
for another course. 


It’s an astonishing fact that, among the class of gauge theories in d = 3+1 dimensions, 
there is a theory of quantum gravity in a higher dimensional spacetime. It leaves us wondering 
just what else is hiding in the land of strongly coupled quantum field theories. 


6.2.3 The Large N Limit is Classical 


We can use the large N counting described above to understand the scaling of correla- 
tion functions. 


In what follows, we consider gauge invariant operators which cannot be further de- 
composed into colour singlets. Since Yang-Mills has only adjoint fields, this means that 
we are interested in operators that have just a single trace. The simplest is 


$$G(x) = {\rm tr}\, F^{\mu\nu}F_{\mu\nu}$$


There’s a slew of further operators in which we add more powers of $F_{\mu\nu}$ inside the trace. 
However, it’s important that the number of fields inside the trace is kept finite as we 
take $N \to \infty$, otherwise it will infect our N counting. This means, for example, that 
we can’t discuss operators like $\det F_{\mu\nu}F^{\mu\nu}$. Of course, Yang-Mills also has non-local 
operators — the Wilson loops — and much of what we say will hold for them. But, for 
once, our main interest will be on the local, single trace operators. 


We could also consider coupling our theory to adjoint matter, either scalars or 
fermions. Restricting to the adjoint representation means that these new fields are 
also N x N matrices, and the same 1/N counting that we developed above holds for 
their Feynman diagram expansion. This gives us the option to build more single trace 
operators, such as $G = {\rm tr}(\phi^m)$ for a scalar $\phi$, or combinations of scalars and field 
strengths. Once again, we insist only that the number of fields inside the trace does 
not scale with N. 


We can compute correlation functions of any of these operators by adding sources in 
the usual way, 


1 
$$S_{\rm YM} = N\int d^4x \left[-\frac{1}{2\lambda}\,{\rm tr}\, F^{\mu\nu}F_{\mu\nu} + \ldots + J_a G_a\right]$$


where the ... is any further adjoint matter that we’ve included, and where the operators 
$G_a$ denote any single trace involving strings of the field strength, the other adjoint 
matter, or their derivatives. Note that we’ve scaled both fields and operators to keep 
an overall factor of N in front of the action. The connected correlation functions can 
be computed in the usual way by differentiating the partition function, 


$$\langle G_1 \cdots G_p\rangle_c = \frac{1}{N^p}\,\frac{\delta}{\delta J_1}\cdots\frac{\delta}{\delta J_p}\,\log Z[J] \qquad (6.10)$$


where the subscript c is there to remind us that we’re dealing with connected correlators. 
Because the action, including the source terms, has the form S = N tr (something), our 
previous large N counting goes over unchanged, and the free energy is dominated by 
planar graphs at order $\log Z \sim N^2$. (This conclusion would no longer hold if we 
included multi-trace operators as sources, or if there were some other powers of N 
that had somehow snuck unseen into the action.) We learn that connected correlation 
functions of single trace operators have the leading scaling 


$$\langle G_1 \cdots G_p\rangle_c \sim N^{2-p} \qquad (6.11)$$


where in this formula, and others below, we’re ignoring the dependence on the ’t Hooft 
coupling $\lambda$. 


The simple formula (6.11) is telling us something interesting: the leading contribution 
to any correlation function comes from disconnected diagrams, rather than connected 
diagrams. For example, any two-point function has a disconnected piece $\langle GG\rangle \sim \langle G\rangle\langle G\rangle \sim N^2$. 
This should be contrasted with the connected piece which scales as $\langle GG\rangle_c \sim N^0$. 


This means that the strict $N \to \infty$ limit of Yang-Mills is a free, classical theory. All 
correlation functions of single trace, gauge invariant operators factorise. Said slightly 
differently, quantum fluctuations are highly suppressed in the large N limit, with the 
variance of any gauge singlet operator $G$ given by 


$$(\Delta G)^2 = \langle GG\rangle - \langle G\rangle\langle G\rangle = \langle GG\rangle_c \sim N^0$$


This is suppressed by $1/N^2$ relative to $\langle G\rangle^2 \sim N^2$. 
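

Large N factorisation can be seen explicitly in a toy model. The Python sketch below is not Yang-Mills: it is a zero-dimensional Gaussian integral over a single Hermitian N x N matrix M, weighted by $\exp(-\frac{N}{2}{\rm tr}\,M^2)$, chosen here only because it shares the "N times a single trace" form of the action. For the operator ${\rm tr}\,M^2$ the mean grows like N while the variance stays of order $N^0$. The sampler uses only the fact that ${\rm tr}\,M^2$ is the sum of $|M_{ij}|^2$ over all entries; the function names and sample sizes are illustrative.

```python
import random
import statistics


def sample_trace_m_squared(n_colours, rng):
    """Draw M with weight exp(-(N/2) tr M^2) and return tr M^2 = sum_ij |M_ij|^2."""
    total = 0.0
    # Diagonal entries are real Gaussians with variance 1/N.
    for _ in range(n_colours):
        total += rng.gauss(0.0, 1.0) ** 2 / n_colours
    # Each off-diagonal pair (ij, ji) contributes 2 |M_ij|^2, where the real and
    # imaginary parts of M_ij are Gaussians with variance 1/(2N).
    for _ in range(n_colours * (n_colours - 1) // 2):
        re = rng.gauss(0.0, 1.0)
        im = rng.gauss(0.0, 1.0)
        total += (re * re + im * im) / n_colours
    return total


def moments(n_colours, n_samples, seed=0):
    """Monte Carlo estimate of the mean and variance of tr M^2."""
    rng = random.Random(seed)
    samples = [sample_trace_m_squared(n_colours, rng) for _ in range(n_samples)]
    return statistics.fmean(samples), statistics.variance(samples)


for n_colours in (4, 8, 16):
    mean, variance = moments(n_colours, n_samples=2000)
    print(f"N = {n_colours:2d}  mean = {mean:.2f}  variance = {variance:.2f}  "
          f"variance / mean^2 = {variance / mean**2:.4f}")
```

In this toy model the mean is exactly N and the variance is exactly 2, so the relative fluctuation falls like $1/N^2$, in line with the scaling of the connected two-point function.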


Usually when we hear the words “free, classical theory”, we think “easy”. That’s not the 
case here. The large N limit is a theory of an infinite number of single trace operators 
$G_a(x)$. If the theory is confining and has a mass gap, like Yang-Mills, each of these 
corresponds to a particle in the theory. (We will make this connection clearer below.) 
Or, to be more precise, each of the operators G(x) corresponds to some complicated 
linear combination of particles in the theory. After diagonalising the Hamiltonian, we 
will have a free theory of an infinite number of massive particles. Determining these 
masses is a difficult problem which remains unsolved. 


The large N limit does not only hold for confining theories. For example, maximally 
supersymmetric Yang-Mills is a conformal field theory and does not confine. Now the 
goal in the large N limit is to diagonalise the dilatation operator to find the conformal 
dimensions of single trace operators. This is a difficult problem that is largely solved 
using techniques of integrability. 


The fact that the large N limit is free leads to the concept of the master field. There 
should be a configuration of the gauge fields $A_\mu$ on which we can evaluate any correlation 
function to get the correct $N \to \infty$ answer. (If we add more adjoint matter fields, we 
would need to specify their value as well.) Once we have this master field, there is 
nothing left to do: no fluctuations, no integrations. We just evaluate. Furthermore, 
the master field should be translationally invariant so, at least in a suitable gauge, the 
$A_\mu$ are just constant. In other words, all of the information about Yang-Mills in the 
$N \to \infty$ limit is contained in four matrices, $A_\mu$. The twist, of course, is that these are 
$\infty \times \infty$ matrices and, as a well known physicist is fond of saying, “you can hide a lot in 
a large N matrix”. For pure Yang-Mills in d = 3 + 1 dimensions, no progress has been 
made in understanding the master field in decades. For maximally supersymmetric 
Yang-Mills, the master field should be equivalent to saying that the theory is really ten 
dimensional gravity in disguise. 


6.2.4 Glueball Scattering and Decay 


The strict $N \to \infty$ limit is free, with the degrees of freedom organised in single trace 
operators $G(x)$. All of the difficulties of the strong coupling dynamics go into diagonalising 
the Hamiltonian to determine masses (or scaling dimensions) of the corresponding 
states. 


At large, but finite N, we introduce interactions between these degrees of freedom, 
which must scale as some power of 1/N. Even though we can’t solve the $N \to \infty$ limit, 
we can still get some useful intuition for the theory by looking at these interactions in 
a little more detail. 


To see this, let’s revert to pure Yang-Mills. We will assume that this theory confines 
in the large N limit. There is no reason to think this is not the case but it’s important 
to stress that we can currently no more prove confinement in the large N limit than at 
finite N¹. We consider the local glueball operators 


$$G(x) = {\rm tr}\, F^m \qquad (6.12)$$


for some m > 2. We’ve ignored the Lorentz indices, which endow each operator with a 
certain spin. We could also include derivatives to increase the spin yet further. 


¹ The Millennium Prize Problem requires that you prove confinement for all compact non-Abelian
gauge groups. This stipulation was put in place to avoid a scenario where confinement was proven 
only in the large N limit. Apparently, the authors of the problem originally meant to find a different 
phrasing, one that avoided the caveat of large N but would award a proof of confinement in, say, 
SU(3) Yang-Mills. But they never got round to changing the wording. Like with all such prizes, if 
you're genuinely interested in the million dollars then you are probably in the wrong field. 
At large $N$, there is a connected component to the two point function which, with
the normalisation (6.11), scales as


$$\langle \mathcal{G}(x)\, \mathcal{G}(0) \rangle_c \sim N^0$$


which means that G(x) creates a glueball state with amplitude of order 1. In terms 
of our original Feynman diagrams, this picks up contributions from very complicated 
processes, such as the one below 


In the large N limit, this is converted into tree-level propagation of gauge singlet 
operators created by G(x). Importantly, the operator G(x) creates only single-particle 
states. To see this, we can cut the diagram to see the intermediate state, as shown 
below 


We’ve now included $i, j = 1, \ldots, N$ indices to help keep track. To make something
gauge invariant, we need to take the trace, which means combining each index with its 
partner. The only way to do this is to include all the internal legs together. This is the 
statement that the internal state corresponds to a single trace operator. In contrast, 
multi-particle states only propagate in non-planar diagrams where the internal lines 
can be combined into multi-trace colour singlets. 


The fact that the single-trace operator $\mathcal{G}(x)$ creates single particle states also follows
from the scaling of the correlation function (6.11). To see this, first suppose that the
statement isn’t true, and $\mathcal{G}$ creates a two particle state with amplitude of order 1. Then
one could construct a suitable correlation function which has the value
$\langle \mathcal{G}\mathcal{G}\mathcal{G}\mathcal{G} \rangle \sim 1$, with the operators $\mathcal{G}$ each interacting, with amplitude 1, with one of the two
intermediate particles. But we know from large $N$ counting (6.11) that
$\langle \mathcal{G}\mathcal{G}\mathcal{G}\mathcal{G} \rangle_c \sim 1/N^2$. (There is actually an implicit assumption here that there is no degeneracy of
states at order $N$. But this is precisely the assumption of confinement.)
So we can think of any two-point function $\langle \mathcal{G}(x)\, \mathcal{G}(0) \rangle_c$ as the tree-level propagation
of confined, single particle states. We are repackaging


$$\sum \text{planar graphs} \;=\; \sum \text{single particles}$$


In general, the only singularities in tree-level graphs are poles. (This is to be contrasted 
with one-loop diagrams where we can have two-particle cuts, and higher loop diagrams 
with multi-particles cuts.) This means that there should be some expansion of the 
two-point function in momentum space as 


$$\langle \mathcal{G}(k)\, \mathcal{G}(-k) \rangle_c = \sum_n \frac{|a_n|^2}{k^2 - M_n^2} \tag{6.13}$$


where $a_n = \langle 0|\mathcal{G}|n\rangle$, with $|n\rangle$ the single particle state with mass $M_n$. But now there’s
something of a puzzle. At large $k$, Yang-Mills theory is asymptotically free, and we can
compute this correlation function to find that it scales as


$$\langle \mathcal{G}(k)\, \mathcal{G}(-k) \rangle_c \to k^2 \log k^2$$


Yet naively the propagator (6.13) would appear to scale as $1/k^2$ for large momentum.
The only way we can reproduce the expected log behaviour is if there are an infinite
number of stable intermediate states $|n\rangle$, with an infinite tower of masses $M_n$. This
coincides with our earlier expectations: as $N \to \infty$ Yang-Mills is a theory of an infinite
number of free particles.


At large but finite N, there can no longer be an infinite tower of stable, massive 
particles. The heavy ones surely decay to the light ones. But this process is captured 
by the correlation functions of the schematic form


$$\langle \mathcal{G}\mathcal{G}\mathcal{G} \rangle_c \sim \frac{1}{N}$$


which tells us that the amplitude for a glueball to decay to two glueballs scales as $1/N$,
so their lifetime scales as $N^2$. Similarly, for scattering we can turn to the four-point
function


$$\langle \mathcal{G}\mathcal{G}\mathcal{G}\mathcal{G} \rangle_c \sim \text{[exchange diagrams]} + \text{[contact diagram]} \sim \frac{1}{N^2}$$


So the amplitude for glueball-glueball scattering scales as $1/N^2$.


6.2.5 Theta Dependence Revisited 

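We saw in Section 2.2 that Yang-Mills theory comes with an extra, topological
parameter: the theta-term. How does this fare in the large $N$ limit? The Lagrangian
is


$$\mathcal{L}_{YM} = -\frac{1}{2g^2}\,\mathrm{tr}\, F_{\mu\nu}F^{\mu\nu} + \frac{\theta}{16\pi^2}\,\mathrm{tr}\, F_{\mu\nu}{}^\star F^{\mu\nu}
= N\left( -\frac{1}{2\lambda}\,\mathrm{tr}\, F_{\mu\nu}F^{\mu\nu} + \frac{\theta}{16\pi^2 N}\,\mathrm{tr}\, F_{\mu\nu}{}^\star F^{\mu\nu} \right)$$


With the appropriate factor of $N$ sitting outside the action, we see that we should keep
$\theta/N$ fixed as we send $N \to \infty$. The first question that we should ask is: does the
physics still depend on $\theta$?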


At first glance, it appears that the answer to this question should be no. The reasons
for this are two-fold. At leading order in perturbation theory, none of the planar graphs
appear to depend on $\theta$. Moreover, the instanton effects which, at weak coupling, give us
$\theta$ dependence now scale as $e^{-8\pi^2/g^2} \sim e^{-8\pi^2 N/\lambda}$ and so are exponentially suppressed
in the large $N$ limit.
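
To get a feel for how quickly the instanton factor dies off, here is a minimal numerical sketch. The function name and the sample value of the ’t Hooft coupling are illustrative assumptions, not taken from the lectures.

```python
import math

def instanton_weight(N, lam):
    """Dilute-instanton weight exp(-8 pi^2 N / lambda) at fixed 't Hooft coupling.

    `lam` stands for lambda = g^2 N; the value used below is only an example.
    """
    return math.exp(-8 * math.pi ** 2 * N / lam)

# At fixed lambda the weight falls off exponentially as N grows.
weights = {N: instanton_weight(N, lam=10.0) for N in (3, 6, 12)}
```

Doubling $N$ at fixed $\lambda$ squares the suppression factor, which is the sense in which the effect is exponentially small.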


Although both of these arguments appear compelling, the conclusion is thought
to be wrong. It is believed that, at leading order in the $1/N$ expansion, the physics
continues to depend on $\theta$ (or, more precisely, on $\theta/N$). Perhaps the simplest observable
is the ground state energy, defined schematically in the Euclidean path integral as


$$e^{-V E(\theta)} = \int \mathcal{D}A\, \exp\left( -\int d^4x\, \mathcal{L}_{YM} \right) \tag{6.14}$$


where $V$ is the spacetime volume. Recall that, in Euclidean space, the theta term
weights the path integral as $e^{i\theta\nu}$ where $\nu$ is the topological winding of the configuration.
The large $N$ arguments that we’ve seen above tell us that $E \sim N^2$. It is believed that
the $\theta$ dependence affects this quantity at leading order


$$E(\theta) = N^2\, h\!\left( \frac{\theta}{N} \right) \tag{6.15}$$


for some function $h(x)$.
There are two main reasons for thinking that $\theta$ dependence survives in the large $N$
limit. The first is that, in the presence of light quarks, the dependence can be seen
in the chiral Lagrangian; we will describe this in Section 6.4. The second is that both
the arguments we gave above also hold in toy models in two dimensions (specifically
the $\mathbb{CP}^{N-1}$ model that we will introduce in Section 7.3) where one can see that they lead to
the wrong conclusion. The loophole lies in the first argument; at leading order in the
$1/N$ expansion we must sum an infinite number of diagrams, and interesting things can
happen for infinite series that don’t arise for finite sums.


To make this more concrete, let’s introduce the topological susceptibility,


$$\chi(k) = \int d^4x\; e^{ik\cdot x}\, \langle\, \mathrm{tr}\,(F_{\mu\nu}{}^\star F^{\mu\nu}(x))\;\, \mathrm{tr}\,(F_{\rho\sigma}{}^\star F^{\rho\sigma}(0)) \,\rangle \tag{6.16}$$


(Not to be confused with the Euler character that we encountered earlier.) Roughly
speaking, this tells us how the theory responds to changes in $\theta$. In particular, the
ground state energy $E(\theta)$ has the dependence


$$\frac{d^2E}{d\theta^2} = \left( \frac{1}{16\pi^2 N} \right)^2 \lim_{k\to 0} \chi(k) \tag{6.17}$$


We can compute contributions to $\chi(k)$ in perturbation theory. One finds that, at
leading order in $1/N$, each individual diagram has $\chi(k) \to 0$ as $k \to 0$. Nonetheless, it
is expected that the sum of all such diagrams does not vanish. No one has managed to
perform this calculation explicitly in four-dimensional Yang-Mills theory. To see that
such behaviour is indeed possible, you need only consider the series


$$f(k) = k^2 \sum_{n=0}^{\infty} \frac{(-1)^n \log^n k^2}{n!} = k^2 \exp\left( -\log k^2 \right) = 1$$


Each term in the sum vanishes as $k \to 0$, yet the sum itself does not.
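
The behaviour of this toy series is easy to check numerically. The sketch below assumes the series has the form $f(k) = k^2 \sum_n (-\log k^2)^n / n!$; the function names and the sample values of $k$ are chosen purely for illustration.

```python
import math

def term(k, n):
    """n-th term of the toy series: k^2 * (-log k^2)^n / n!."""
    return k ** 2 * (-math.log(k ** 2)) ** n / math.factorial(n)

def partial_sum(k, n_max):
    """Sum of the terms with n = 0, ..., n_max."""
    return sum(term(k, n) for n in range(n_max + 1))

# For any fixed n, the term shrinks as k -> 0 ...
fixed_n_terms = [abs(term(k, 3)) for k in (1e-2, 1e-4, 1e-6)]

# ... but the full sum resums to k^2 * exp(-log k^2) = 1.
total = partial_sum(0.1, 60)
```

Truncating the sum at any finite order would suggest that $f(k)$ vanishes at small $k$; only the full resummation reveals the constant.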

The behaviour of the ground state energy (6.15) brings a new puzzle. The energy
depends on $\theta/N$, but must obey $E(\theta) = E(\theta + 2\pi)$. How can we reconcile these two
properties? The accepted answer, and the one which is seen in the $\mathbb{CP}^{N-1}$ model, is
that there is a level crossing in the ground state as $\theta$ is varied. This works as follows: at
large $N$ the theory is thought to have a large number of meta-stable, Lorentz-invariant
states that differ in energy. There are order $N$ such states and, in the $k^{\rm th}$, the energy
is given by


$$E_k(\theta) = N^2\, h\!\left( \frac{\theta + 2\pi k}{N} \right)$$


The ground state energy is then
$$E(\theta) = \min_k\, E_k(\theta) \tag{6.18}$$
Figure 50: The vacuum energy as a function of $\theta$.


We’re left with a function which is periodic, but not smooth. In particular, when $\theta = \pi$
two levels cross.


What does the function $E(\theta)$ look like? First, we know that it has its minimum at
$\theta = 0$. This is because the Euclidean path integral (6.14) is a sum over configurations
weighted by $e^{i\theta\nu}$. Only for $\theta = 0$ is this real and positive, hence maximising $e^{-V E(\theta)}$,
and so minimising $E(\theta)$. Taylor expanding, we therefore expect that


$$E(\theta) = \min_k\, \frac{1}{2} C\, (\theta + 2\pi k)^2 + \mathcal{O}\!\left( \frac{1}{N} \right)$$


where $C = \chi(0)/(16\pi^2 N)^2$. This is shown in the figure.
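
The branch structure can be made concrete with a few lines of code. This is a minimal sketch of the quadratic approximation only: the constant `C` is set to 1 for illustration, and the finite range of branches `k` is an assumption that is adequate for $\theta$ within a few periods of zero.

```python
import math

def branch_energy(theta, k, C=1.0):
    """Energy of the k-th meta-stable branch, (1/2) C (theta + 2 pi k)^2."""
    return 0.5 * C * (theta + 2 * math.pi * k) ** 2

def ground_energy(theta, C=1.0, k_range=range(-5, 6)):
    """Ground state energy: the minimum over the branches in k_range."""
    return min(branch_energy(theta, k, C) for k in k_range)

# Periodicity: shifting theta by 2 pi just relabels the branches.
e1 = ground_energy(0.3)
e2 = ground_energy(0.3 + 2 * math.pi)

# Level crossing: at theta = pi the k = 0 and k = -1 branches are degenerate.
crossing_gap = branch_energy(math.pi, 0) - branch_energy(math.pi, -1)
```

Each branch on its own is a smooth parabola that is not periodic; the minimum over branches is periodic but has a cusp at $\theta = \pi$.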


A general value of $\theta$ explicitly breaks time-reversal or, equivalently, CP. The two
exceptions are $\theta = 0$ and $\theta = \pi$. (We explained why $\theta = \pi$ is time reversal invariant in
Section 1.2.5). But, at $\theta = \pi$, there are two degenerate ground states and time-reversal
invariance maps one to the other. We learn that, in large $N$ Yang-Mills, time-reversal
invariance is spontaneously broken at $\theta = \pi$. This coincides with our conclusion from
Section 3.6 using discrete anomalies.


6.3 Large N QCD 


Our discussion in the previous section focussed purely on matrix valued fields. To get 
closer to QCD, we add quarks, as Dirac fermions in the fundamental representation. 


We rescale the quark field $\psi \to \sqrt{N}\psi$, so that the action continues to have a factor
of $N$ sitting outside,


$$S_{QCD} = N \int d^4x \left( -\frac{1}{2\lambda}\,\mathrm{tr}\, F_{\mu\nu}F^{\mu\nu} + i\bar\psi\, \slashed{D}\, \psi \right)$$


We'll stick with just a single quark field for now, but everything that we say will go
over for $N_f$ flavours of quarks provided that we keep $N_f$ fixed as $N \to \infty$.


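The quark field carries just a single gauge index, $\psi^i$ with $i = 1, \ldots, N$. Correspondingly,
it is represented by just a single line in a Feynman diagram,


[Diagram: a single directed line, the quark propagator] $\sim \frac{1}{N}$


Meanwhile, the quark-gluon vertex is represented by


[Diagram: a single quark line emitting a double-line gluon] $\sim N$
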

We can now repeat the large N counting that we saw previously. We can start by 
looking at contributions to the vacuum energy that include a quark loop. For example, 
we have 


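[Diagram: a quark loop with a single gluon exchanged across it]
$\sim N^2 \times \left( \frac{1}{N} \right)^3 \times N^2 \sim N$


where the first factor of $N^2$ comes from the two quark-gluon vertices, the factor of
$(1/N)^3$ from the three propagators, and the final factor of $N^2$ from the index loops.
We see that this is subleading compared to the
pure glue vacuum diagrams which are $\sim N^2$. Including extra internal gluons, all planar
diagrams with a single quark loop on the boundary will continue to scale as $\sim N$. This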
is the leading order contribution to the vacuum energy that includes quarks. This is 
simple to understand: the amplitude to create a quark is the same as the amplitude to 
create a gluon, but there are N? gluon degrees of freedom and only N quark degrees 
of freedom. 


If the quark loop does not run around the boundary, the diagram is suppressed yet 
further. For example, consider the diagram 


[Diagram: a vacuum diagram in which the quark loop does not run around the boundary, with its power counting in $N$]


Similarly, if we include internal quark lines in other Feynman diagrams, say the gluon 
propagator, we again get a suppression factor of 1/N. 
We can again interpret the large $N$ Feynman diagrams in terms of 2d surfaces.
However, now the surfaces are no longer closed. Instead, each quark loop should be
thought of as the boundary of a hole on the Riemann surface. Each boundary increases
the number of edges $E$ by one, so a given Feynman diagram again scales as


$$\text{diagram} \sim N^{F-E+V} \lambda^{E-V} = N^{\chi} \lambda^{E-V}$$


which is the same result that we had before. But now the expression for the Euler
character is


$$\chi = 2 - 2H - B$$
where $B$ is the number of boundaries, or holes, in the surface.


In terms of string theory, the addition of quarks means that the large N limit includes 
open strings, with boundaries, as well as closed strings. This is closely related to the 
concept of D-branes in string theory. 
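
The topological counting above is simple enough to tabulate. The following sketch just evaluates $\chi = 2 - 2H - B$ for a few surfaces; the function name is a hypothetical helper introduced here, and the powers of $\lambda$ are ignored.

```python
def euler_character(handles, boundaries):
    """Euler character chi = 2 - 2H - B; a diagram on this surface scales as N**chi."""
    return 2 - 2 * handles - boundaries

# Leading power of N for a few topologies.
examples = {
    "planar pure glue (sphere)": euler_character(0, 0),      # N^2
    "planar with one quark loop (disc)": euler_character(0, 1),  # N^1
    "planar with two quark loops": euler_character(0, 2),    # N^0
    "pure glue with one handle (torus)": euler_character(1, 0),  # N^0
}
```

Each extra quark loop costs one power of $N$, while each extra handle costs two, which is why quark loops are the leading correction to the planar glue diagrams.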


6.3.1 Mesons 
We can now rerun the arguments of Sections 6.2.3 and 6.2.4 for large N QCD. In 
addition to the glueball operators (6.12), we also have the meson operators 


$$\mathcal{J}(x) = \sqrt{N}\, \bar\psi\, F^m\, \psi \tag{6.19}$$


where the $F^m$ can denote any number of field strengths, derivatives and gamma matrices,
so that $\mathcal{J}(x)$ is a local, gauge invariant operator that cannot be decomposed into
smaller colour singlets.


Note that we’ve included an overall factor of $\sqrt{N}$ in (6.19). To see why this is, we
compute correlation functions


$$\langle \mathcal{J}_1 \ldots \mathcal{J}_p \rangle_c \sim N \cdot N^{-p/2} = N^{1 - p/2} \tag{6.20}$$


The first factor of $N$ comes from the planar diagrams with a quark loop running
along the boundary. The normalisation factor of $\sqrt{N}$ in (6.19) means that correlation
functions scale as $N^{-p/2}$ rather than as $N^{-p}$. This normalises the two-point function as
$\langle \mathcal{J}\mathcal{J} \rangle_c \sim N^0$, so $\mathcal{J}$ creates a meson state with amplitude 1.
The same arguments that we used for pure Yang-Mills still apply here. The strict
$N \to \infty$ limit is again a free theory, now including infinite towers of both glueball and
meson states. In momentum space, the analog of the propagator (6.13) is


$$\langle \mathcal{J}(k)\, \mathcal{J}(-k) \rangle_c = \sum_n \frac{|b_n|^2}{k^2 - m_n^2} \tag{6.21}$$


where $b_n = \langle 0|\mathcal{J}|n\rangle$, with $|n\rangle$ the single particle meson state with mass $m_n$. As for
glueballs, this expression is only compatible with the log behaviour of asymptotic freedom
if there is an infinite tower of massive meson states.


At large $N$, the three point function of meson fields


$$\langle \mathcal{J}\mathcal{J}\mathcal{J} \rangle_c \sim \frac{1}{\sqrt{N}}$$
tells us that the amplitude for a meson to decay into two lighter mesons scales as $1/\sqrt{N}$.
The lifetime of a meson is then typically of order $N$. They are shorter lived than the
glueballs. Similarly, the four point function of meson fields is


$$\langle \mathcal{J}\mathcal{J}\mathcal{J}\mathcal{J} \rangle_c \sim \frac{1}{N}$$


The amplitude for meson-meson scattering scales as $1/N$.


We can also compute correlation functions of both glueballs and mesons. At leading
order, we have


$$\langle \mathcal{J}_1 \ldots \mathcal{J}_p\, \mathcal{G}_1 \ldots \mathcal{G}_q \rangle_c \sim N \cdot N^{-p/2}\, N^{-q} = N^{1 - p/2 - q}$$


This means that the two-point function $\langle \mathcal{J}\mathcal{G} \rangle \sim 1/\sqrt{N}$, so mesons and glueballs don’t
mix at large $N$, even if they share the same quantum numbers. (We had assumed this
when talking separately about mesons and glueballs above, so it’s good to know it’s
true.) We can also extract the amplitude for a glueball to decay into two mesons which
is $\langle \mathcal{G}\mathcal{J}\mathcal{J} \rangle \sim 1/N$, which is the same order as the decay into two glueballs. Meanwhile,
the amplitude for a meson to decay into two glueballs is $\langle \mathcal{J}\mathcal{G}\mathcal{G} \rangle \sim 1/N^{3/2}$. We see that
a glueball doesn’t much mind who it decays into, while a meson greatly prefers decaying
into other mesons.
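
The power counting for all of these correlators can be collected in one small function. This is a sketch under the normalisations used above, with connected correlators scaling as $N^{2-q}$ for pure glue and $N^{1 - p/2 - q}$ once meson operators are present; the function name is an illustrative assumption.

```python
from fractions import Fraction

def n_exponent(p_mesons, q_glueballs):
    """Leading power of N for a connected correlator of p meson and q glueball operators."""
    if p_mesons == 0:
        # Pure glue: planar diagrams on the sphere.
        return Fraction(2 - q_glueballs)
    # At least one meson operator: planar diagrams with one quark loop on the boundary.
    return Fraction(1) - Fraction(p_mesons, 2) - q_glueballs

scalings = {
    "glueball -> 2 glueballs": n_exponent(0, 3),   # -1
    "meson -> 2 mesons": n_exponent(3, 0),         # -1/2
    "meson-glueball mixing": n_exponent(1, 1),     # -1/2
    "glueball -> 2 mesons": n_exponent(2, 1),      # -1
    "meson -> 2 glueballs": n_exponent(1, 2),      # -3/2
}
```

Exact fractions are used so that the half-integer powers of $N$ are not lost to floating point rounding.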


The OZI Rule 


The large N approach helps explain a couple of phenomenological facts that had been 
previously observed to hold for QCD. In particular, note that the leading order meson 
decays have the form


[Diagram: a meson splitting into two mesons, with a quark-antiquark pair created] $\sim \frac{1}{\sqrt{N}}$


In such a process, one of the original quarks ends up in each of the final decay products.
In contrast, a process in which the two original quarks annihilate into pure glue, which
subsequently produces two further mesons, is suppressed by an extra factor of $1/N$.


This suppression was observed experimentally in the early days of meson physics and 
goes by the name of the OZI rule (for Okubo, Zweig and Iizuka; it is also sometimes 
called the Zweig rule). 


The standard example is the $\phi$ vector meson, which has quark content $s\bar{s}$. On energy
considerations alone, one would have thought this would decay to $\pi^+\pi^-\pi^0$, none of
which contain a strange quark. In reality, this decay is suppressed by QCD dynamics,
and the $\phi$ meson decays primarily to $K^+K^-$, where the positively charged kaon has
quark content $u\bar{s}$. This fact is clearest in the $1/N$ expansion.


The large $N$ expansion also makes it clear that we don’t expect to see meson bound
states or, more generally, $qq\bar{q}\bar{q}$ states with four quarks. Such states are referred to as
exotics. The amplitude for meson interactions scales as $1/N$, so such exotics certainly
don’t form in the large $N$ limit. The lack of exotics in the particle data book suggests that
this suppression extends down to $N = 3$.


6.3.2 Baryons 


We now turn to baryons. These are a little more subtle because they contain N quarks, 
anti-symmetrised over the colour indices. Nonetheless, as first explained by Witten, 
they are naturally accommodated in the large N limit of QCD. 


In what follows we will consider the large $N$ limit with just a single flavour of quark,
although it is not difficult to include $N_f > 1$ flavours. The baryon is then


$$B = \epsilon_{i_1 \ldots i_N}\, \psi^{i_1} \ldots \psi^{i_N} \tag{6.22}$$


This is the large $N$ analog of, say, the $\Delta^{++}$ in QCD which contains three up quarks,
or the $\Delta^-$ which contains three down quarks.


We can start by modelling these as $N$ distinct quark lines. A gluon exchange between
any pair of quarks is


[Diagrams: two quark lines exchanging a gluon, drawn in single-line and double-line notation] $\sim \frac{1}{N}$  (6.23)


where we’ve been more careful in the second diagram in showing how the arrows flow.
However, there are $\frac{1}{2}N(N-1) \sim N^2$ different pairs of quarks, so the total amplitude for a
gluon exchange within a baryon is order $N$.


There is a similar story for three body interactions. The gluon exchange is now


[Diagram: three quark lines joined by gluon exchange] $\sim \frac{1}{N^2}$  (6.24)


but there are order $N^3$ triplets of quarks, so again the total amplitude scales as $N$.


These simple arguments suggest that many-body interactions are all equally impor- 
tant, and contribute to the energy of the baryon at order N. It is therefore natural to 
guess that 


$$M_{\rm baryon} \sim N \tag{6.25}$$


This is perhaps not a surprise since the baryon contains N quarks, and is certainly to 
be expected in the non-relativistic quark model. 
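
The combinatorics behind this estimate can be checked directly. The sketch below assumes that a connected $k$-body exchange is suppressed by $1/N^{k-1}$, which extrapolates the $1/N$ of (6.23) and the $1/N^2$ of (6.24) to general $k$; that extrapolation, and the function name, are assumptions made here for illustration.

```python
from math import comb, factorial

def k_body_total(N, k):
    """Number of k-quark groups times the assumed 1/N^(k-1) strength of a k-body exchange."""
    return comb(N, k) / N ** (k - 1)

# Every k-body term ends up of order N, with an N-independent coefficient ~ 1/k!.
N = 10_000
coefficients = {k: k_body_total(N, k) / N for k in (2, 3, 4)}
expected = {k: 1 / factorial(k) for k in (2, 3, 4)}
```

The explicit $1/N$ factors are compensated by the number of ways of choosing the quarks, so no multi-body term is parametrically smaller than any other.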


There’s a calculation which may give you pause. Consider the gluon exchange
between two different pairs of quarks,


[Diagram: two disconnected gluon exchanges, each between a different pair of quarks] $\sim \frac{1}{N^2}$  (6.26)


But now there are $\sim N^4$ ways of picking two pairs of quarks, so it looks as if this
contributes to the energy at order $N^4 \times N^{-2} \sim N^2$. It seems like we get increasingly
divergent answers as we look at more and more disconnected pieces. In fact, this is
the kind of behaviour that we would expect if the baryon mass scales as (6.25). The
propagator for large times $T$ then takes the form
$$e^{-iM_{\rm baryon}T} \approx 1 - iM_{\rm baryon}T - \frac{1}{2}M_{\rm baryon}^2 T^2 + \ldots$$
For the diagram (6.26), each of the gluons can be exchanged at any time and so it
corresponds to the second order term in the expansion above which, we see, should
indeed scale as $M_{\rm baryon}^2 \sim N^2$.


At this point, we could start to explore the interactions between baryons and mesons, 
and build towards a fuller phenomenology of QCD. However, we won’t go in this di- 
rection. Instead, I will point out a nice connection between baryons in the large N 
expansion and another recurring topic from these lectures. 


The Hartree Approximation 


A particularly simple way to proceed is to assume that the quarks are non-relativistic. 
This is not particularly realistic for QCD, but it will provide a simple way to shine a 
light on the structure of the baryon. If each quark has mass m, we could try to model 
their physics inside a baryon by the following Hamiltonian 


$$H = Nm + \sum_{i=1}^{N} \frac{p_i^2}{2m} + \frac{1}{N}\sum_{i<j} V_2(x_{ij}) + \frac{1}{N^2}\sum_{i<j<k} V_3(x_{ij}, x_{ik}) + \ldots$$
where $x_{ij} = x_i - x_j$ and the coefficients in front of the potentials are taken from (6.23)
and (6.24). We should also include all multi-particle potentials. As we have seen, it is
a mistake to think that these potentials are genuinely suppressed by the $1/N$ factors
in the Hamiltonian: these are compensated by the sums over particles, so each term
ends up of order $N$.


There is a straightforward variational approach to such many-body Hamiltonians 
called the Hartree approximation. It is the first port of call in atomic physics, when 
studying atoms with many electrons, and we met it in the lectures on Topics in Quan- 
tum Mechanics. The idea is to work with the ansatz for the ground state wavefunction 
given by 


$$\Psi(x_1, \ldots, x_N) = \prod_{i=1}^{N} \phi(x_i)$$
Note that the quarks are fermions, but they have already been anti-symmetrised over 


the colour indices (6.22), so it is appropriate that the wavefunction for the remaining 
degrees of freedom is symmetric. 
The Hartree ansatz neglects interactions between the quarks. Instead, it is a self-
consistent approach in which each quark experiences a potential due to all the others. 
This approach becomes increasingly accurate as the number of particles becomes large, 
so it is particularly well suited to baryons in the large N limit. 


Evaluating the Hamiltonian on the Hartree wavefunction gives, at leading order in $N$,
$$\langle \Psi|H|\Psi \rangle = N \Big[\, m + \frac{1}{2m}\int d^3x\, |\nabla\phi(x)|^2 + \frac{1}{2}\int d^3x_1\, d^3x_2\; V_2(x_{12})\, |\phi(x_1)\phi(x_2)|^2$$
$$\qquad\qquad + \frac{1}{6}\int d^3x_1\, d^3x_2\, d^3x_3\; V_3(x_{12}, x_{23})\, |\phi(x_1)\phi(x_2)\phi(x_3)|^2 + \ldots \Big]$$


We then find the $\phi(x)$ which minimises this expression. This, obviously, is a hard
problem. But fortunately it is not one we need to solve in order to extract the main
lessons. These come simply from the fact that there is a factor of $N$ outside the
bracket, but nothing inside. This confirms our earlier conclusion (6.25) that the mass
of the baryon indeed scales as $M_{\rm baryon} \sim N$. But we also learn something new, because
whatever function $\phi(x)$ ends up being, it certainly does not depend on $N$. This means
that the size of the baryon, its spatial profile in $\phi(x)$, is order 1.
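
Although the real minimisation is hard, the logic can be illustrated in a toy setting. The sketch below is not the QCD problem: it assumes one spatial dimension, a Gaussian trial function $\phi(x) \propto e^{-x^2/2\sigma^2}$, only a two-body potential, and a made-up confining form $V_2(r) = \kappa r^2$. For this choice the energy per quark is $m + 1/(4m\sigma^2) + \kappa\sigma^2/2$, which is minimised by a simple scan over the width.

```python
def energy_per_quark(sigma, m=1.0, kappa=1.0):
    """Toy Hartree energy per quark for a 1d Gaussian of width sigma.

    kinetic:   (1/2m) * integral |phi'|^2            = 1 / (4 m sigma^2)
    potential: (1/2)  * <kappa (x1 - x2)^2>          = kappa sigma^2 / 2
    """
    kinetic = 1.0 / (4.0 * m * sigma ** 2)
    potential = 0.5 * kappa * sigma ** 2
    return m + kinetic + potential

def best_width(m=1.0, kappa=1.0):
    """Scan widths on a grid and return the one with the lowest energy per quark."""
    sigmas = [0.01 * i for i in range(1, 1000)]
    return min(sigmas, key=lambda s: energy_per_quark(s, m, kappa))

def baryon_mass(N, m=1.0, kappa=1.0):
    """Total energy: N times an N-independent bracket."""
    return N * energy_per_quark(best_width(m, kappa), m, kappa)
```

The minimisation never refers to $N$, so the optimal width is the same for every $N$ while the mass grows linearly, which mirrors the two lessons drawn from the Hartree expression.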


The mass and size of the baryon are rather suggestive. Recall that the large N limit 
is a theory of weakly coupled gauge singlets, interacting with coupling 1/N. This means 
that the mass of the baryon scales as the inverse coupling, N, with the size independent 
of the coupling. But this is the typical behaviour of solitons. For example, the ’t Hooft 
Polyakov monopole that we met in Section 2.8 has a mass which scales as $1/g^2$ and a
size which is independent of $g^2$. This strongly suggests that the baryon should emerge
as a soliton in large N QCD. 


We have, of course, already seen a context in which baryons emerge as solitons: 
they are the Skyrmions in the chiral Lagrangian that we met in Section 5.3. To my 
knowledge, this connection has not been fully explained. 


Before we move on, there is one further twist to the “baryons as solitons” story. The 
mass of the baryon, N, is not quite like the mass of the monopole: it is proportional to 
the inverse coupling, rather than the square of the inverse coupling. Returning to the 
language of string theory that we introduced in Section 6.2.2, the mass of the baryon 


scales as


$$M_{\rm baryon} \sim \frac{1}{g_s}$$


with $g_s = 1/N$ the string coupling constant. This suggests that baryons are a rather
special kind of soliton: they are D-branes. These are objects in string theory on which
strings can end, and have a number of magical properties. (You can read more about
D-branes in the lectures on String Theory.) With its $N$ constituent quarks, the baryon
is indeed a vertex on which $N$ QCD flux tubes can end.


6.4 The Chiral Lagrangian Revisited 


In this section, we will see what becomes of the chiral Lagrangian at large N. Let’s 
first recall the usual story: Yang-Mills coupled to $N_f$ massless fermions has a classical
global symmetry


$$G = U(N_f)_L \times U(N_f)_R \tag{6.27}$$


However, the anomaly means that $U(1)_A$ does not survive the quantisation process,
leaving us just with $U(1)_V \times SU(N_f)_L \times SU(N_f)_R$. This is subsequently broken to
$U(1)_V \times SU(N_f)_V$, and the resulting Goldstone modes are described by the chiral
Lagrangian.


How does this story change at large $N$? The key lies in the anomaly, which is given
by


$$\partial_\mu J^\mu_A = \frac{g^2 N_f}{8\pi^2}\, \mathrm{tr}\, F_{\mu\nu}{}^\star F^{\mu\nu} \tag{6.28}$$


In the large $N$ limit, we send $g^2 \to 0$ keeping $\lambda = g^2 N$ fixed. This suggests that the
anomaly is suppressed in the large $N$ limit and the quantum theory enjoys the full
chiral symmetry (6.27). This means that there is one further Goldstone mode that
appears: the $\eta'$ meson. In this section we will see how this plays out.


6.4.1 Including the $\eta'$


Our first steps are a straightforward generalisation of the chiral Lagrangian derived in
Section 5.2. The chiral condensate takes the form


$$\langle \bar\psi_{\mathsf{i}}\, \psi_{\mathsf{j}} \rangle = \sigma\, \Sigma_{\mathsf{ij}}$$


but now with $\Sigma \in U(N_f)$ rather than $SU(N_f)$. (The ugly $\mathsf{i}, \mathsf{j} = 1, \ldots, N_f$ flavour
indices are to ensure that we don’t confuse them with the $i, j$ colour indices we’ve used
elsewhere in this Section.) As before, we promote the order parameter to a dynamical
field, $\Sigma \to \Sigma(x)$, whose ripples describe our massless mesons, transforming under the
chiral symmetry $G$ as


$$\Sigma(x) \to L^\dagger\, \Sigma(x)\, R \tag{6.29}$$


with $L \times R \in G$. The overall phase of $\Sigma$ is our new Goldstone boson, $\eta'$,
$$\det \Sigma = e^{i\eta'/f_{\eta'}} \tag{6.30}$$


We would now like to construct the Lagrangian consistent with the chiral symmetry
(6.29). Unlike in Section 5.2, we now have two different terms with two derivatives,


$$(\mathrm{tr}\, \Sigma^\dagger \partial_\mu \Sigma)^2 \quad \text{and} \quad \mathrm{tr}\,(\partial_\mu \Sigma^\dagger\, \partial^\mu \Sigma) \tag{6.31}$$


The first term vanishes when $\Sigma \in SU(N_f)$, but survives when $\Sigma \in U(N_f)$. In other
words, it provides a kinetic term only for $\eta'$. Meanwhile, the second term treats all
Goldstone modes on the same footing.


Large N-ology tells us that all these mesons have the same properties and, in particular,
to leading order in $1/N$ we have $f_{\eta'} = f_\pi$. This means that we need only the
second kinetic term and the chiral Lagrangian takes the same form as (5.7),


$$\mathcal{L} = \frac{f_\pi^2}{4}\, \mathrm{tr}\,(\partial_\mu \Sigma^\dagger\, \partial^\mu \Sigma)$$


We can compute the expected scaling of $f_\pi$ with $N$. Recall that the pion decay constant
$f_\pi$ is defined by (5.13), which, suppressing flavour indices, reads


$$\langle 0|\, J_\pi^\mu(x)\, |\pi(p)\rangle = -i f_\pi\, p^\mu\, e^{-ip\cdot x}$$


with $J_\pi$ a generator of the $SU(N_f)$ flavour current. At this point we need to be a
little careful about normalisations. The current $J_\pi$ above is defined with the usual
kinetic term $\mathcal{L} \sim i\bar\psi\, \slashed{\partial}\, \psi$. Meanwhile, our large $N$ counting used a different
normalisation in which there was an overall factor of $N$ outside the action. Chasing
this through means that the current $J_\pi$ is related to the appropriately normalised large
$N$ current (6.19) by


$$J_\pi = \sqrt{N}\, \mathcal{J}$$
We can then use the general result (6.20) to find


$$\langle J_\pi J_\pi \rangle = \sum_n \langle 0|J_\pi|n\rangle \langle n|J_\pi|0\rangle \sim N \quad \Rightarrow \quad \langle 0|J_\pi|n\rangle \sim \sqrt{N}$$


This means that the pion decay constant scales as


$$f_\pi \sim \sqrt{N}$$


6.4.2 Rediscovering the Anomaly 


So far, things are rather easy. Now we would like to consider what happens at the 
next order in $1/N$. Obviously, we could add the other kinetic term in (6.31), splitting
$f_{\eta'}$ and $f_\pi$. This doesn’t greatly change the physics and we will ignore this possibility
below. Instead, there is a much more dramatic effect that we must take into account,
because the anomaly now gives $\eta'$ a mass. How do we describe that?


We can isolate $\eta'$ by taking the determinant (6.30), and therefore introduce a mass
term by


$$\mathcal{L} = \frac{f_\pi^2}{4}\, \mathrm{tr}\,(\partial_\mu \Sigma^\dagger\, \partial^\mu \Sigma) - \frac{f_\pi^2}{4N_f}\, m_{\eta'}^2\, (-i \log\det\Sigma)^2$$


Here $m_{\eta'}$ is the mass which must vanish as $N \to \infty$. We will see shortly that $m_{\eta'}^2 \sim 1/N$.


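It is unusual to include a log term in an effective action. However, as we will now
see, it captures a number of aspects of the anomaly. To illustrate this, let’s first add
masses for the quarks. As we saw in Section 5.2.3, this is achieved by including
the term


$$\mathcal{L} = \frac{f_\pi^2}{4}\, \mathrm{tr}\,(\partial^\mu \Sigma^\dagger\, \partial_\mu \Sigma) - \frac{\sigma}{2}\, \mathrm{tr}\,(M\Sigma + \Sigma^\dagger M^\dagger) - \frac{f_\pi^2}{4N_f}\, m_{\eta'}^2\, (-i \log\det\Sigma)^2$$


with $M$ a complex mass matrix. By a suitable $SU(N_f) \times SU(N_f)$ rotation, we can
always choose


$$M = e^{i\theta/N_f}\, \tilde{M}$$


where $\tilde{M}$ is diagonal, real and positive. This final phase can be removed by a $U(1)$
rotation, $\Sigma \to e^{-i\theta/N_f}\, \Sigma$, to make the mass real. But this now shows up in the mass term
for the $\eta'$,


$$\mathcal{L} = \frac{f_\pi^2}{4}\, \mathrm{tr}\,(\partial^\mu \Sigma^\dagger\, \partial_\mu \Sigma) - \frac{\sigma}{2}\, \mathrm{tr}\,(M\Sigma + \Sigma^\dagger M^\dagger) - \frac{f_\pi^2}{4N_f}\, m_{\eta'}^2\, (-i \log\det\Sigma - \theta)^2$$


where $M$ is now real and diagonal. However, we’ve played these games before: in Section 3.3.3, we saw that rotating the
phase of the mass matrix is equivalent to introducing a theta angle. We conclude that
this is how the QCD theta angle appears in the chiral Lagrangian.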


We can now minimise this potential to find the ground state. With $M$ diagonal, the
ground state always takes the form


$$\Sigma = \mathrm{diag}\,(e^{i\phi_1}, \ldots, e^{i\phi_{N_f}})$$


The exact form depends in a fairly complicated manner on the choices of mass matrix
$M$ and theta angle. To proceed, we must make some assumptions. We will take $m_{\eta'}$
much bigger than all other masses, which means that we first impose the $\eta'$ mass term
as a constraint


$$\sum_{i=1}^{N_f} \phi_i = \theta \mod 2\pi$$


We further look at the simplest case of a diagonal mass matrix: $M = m\, \mathbf{1}_{N_f}$, with
$m > 0$. We will then see how the ground states change as we vary $\theta$.


For $\theta = 0$, the ground state sits at $\Sigma = 1$. Now we increase $\theta$. What happens next
differs slightly for $N_f = 2$ and $N_f > 2$. Let’s start with $N_f = 2$. As we increase $\theta$,
the ground state moves to $\phi_1 > 0$ and the overall magnitude of the potential decreases.
As $\theta \to \pi$, the ground state tends towards $\phi_1 = \pi/2$. At $\theta = \pi$ itself, the potential
vanishes for all $\phi_1$, which is symptomatic of a second order phase transition. If we now
increase $\theta$ just a little more, the ground state jumps to $\phi_1 = -\pi/2$, before moving back
towards $\phi_1 = 0$ as $\theta \to 2\pi$. The sequence is shown in the plots below for $\theta = 0$, $\frac{2\pi}{3}$, $\pi$
and $\frac{4\pi}{3}$.


[Figure: the potential $E(\phi_1)$ for $N_f = 2$, in four panels with $\theta = 0$, $2\pi/3$, $\pi$ and $4\pi/3$.]
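
The $N_f = 2$ story can be reproduced with a short numerical scan. This is a minimal sketch under stated assumptions: the constraint is taken to fix $\phi_2 = \theta - \phi_1$, the potential is measured in units where its overall coefficient is 1, and the overall sign is chosen so that the $\theta = 0$ ground state sits at $\Sigma = 1$. The function names are illustrative.

```python
import math

def potential(phi1, theta):
    """Toy N_f = 2 potential with the constraint phi2 = theta - phi1 imposed."""
    return -(math.cos(phi1) + math.cos(theta - phi1))

def ground_state(theta, steps=3600):
    """Value of phi1 in [-pi, pi) that minimises the potential, found on a grid."""
    grid = [-math.pi + 2 * math.pi * i / steps for i in range(steps)]
    return min(grid, key=lambda p: potential(p, theta))

# Track the minimum as theta is increased through pi.
below = ground_state(math.pi - 0.1)   # close to +pi/2
above = ground_state(math.pi + 0.1)   # close to -pi/2

# At theta = pi the potential is flat in phi1.
flatness = max(abs(potential(0.01 * i, math.pi)) for i in range(-300, 300))
```

In this toy form the potential is $-2\cos(\theta/2)\cos(\phi_1 - \theta/2)$, which makes both the flat direction at $\theta = \pi$ and the jump in $\phi_1$ just beyond it easy to see.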


The fact that the potential vanishes when $\theta = \pi$ is special to $N_f = 2$. The story
for $N_f \geq 3$ is similar, except that there are now just two degenerate vacua at $\theta = \pi$.
This is characteristic of a first order phase transition. The potential for $N_f = 3$ for
$\theta = 0$, $\frac{2\pi}{3}$, $\pi$ and $\frac{4\pi}{3}$ is shown below.


[Figure: the potential for $N_f = 3$, in four panels with $\theta = 0$, $2\pi/3$, $\pi$ and $4\pi/3$.]


6.4.3 The Witten-Veneziano Formula 


So far, we’ve happily incorporated the new $\eta'$ Goldstone boson into our chiral
Lagrangian. However, this brings something of a puzzle, which is to reconcile the following
facts:


• The ground state energy is $E(\theta) \sim N^2$ and depends on $\theta$.
• Quarks contribute to quantities such as $E(\theta)$ at order $N$.
• All $\theta$ dependence vanishes if we have a massless fermion.


These three facts seem incompatible. How can the $\sim N$ contribution from quarks
cancel the $\sim N^2$ contribution from gluons to render $E(\theta)$ independent of $\theta$?


To see how this might work, let’s consider schematically the contribution to the
susceptibility (6.16)


$$\chi(k) = \sum_{\rm glueballs} \frac{N^2 a_n^2}{k^2 - M_n^2} + \sum_{\rm mesons} \frac{N b_n^2}{k^2 - m_n^2}$$


where $M_n$ are the masses of glueballs, $m_n$ the masses of mesons, and $a_n$ and $b_n$ the
amplitudes for $\mathrm{tr}\, F_{\mu\nu}{}^\star F^{\mu\nu}$ to create these states from the vacuum,


$$\langle 0|\,\mathrm{tr}\, F{}^\star F\,|n^{\rm th}\ \text{glueball}\rangle = N a_n \ , \qquad \langle 0|\,\mathrm{tr}\, F{}^\star F\,|n^{\rm th}\ \text{meson}\rangle = \sqrt{N}\, b_n$$
We want the second term to cancel the first in the limit $k \to 0$. We can achieve this
only if there is some meson whose mass scales as $m^2 \sim 1/N$. But this tallies with our
discussion above; we expect that the $\eta'$ becomes a genuine Goldstone boson in the large
$N$ limit. We’re therefore led to the conclusion


$$\chi(0)\Big|_{\rm Yang-Mills} = \frac{N b_{\eta'}^2}{m_{\eta'}^2} \tag{6.32}$$

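But we can now use our anomaly equation (6.28) to write


$$\sqrt{N}\, b_{\eta'} = \langle 0|\,\mathrm{tr}\, F{}^\star F\,|\eta'\rangle = \frac{8\pi^2 N}{\lambda N_f}\, \langle 0|\,\partial_\mu J^\mu_A\,|\eta'\rangle$$


But we know from our discussion of currents in the chiral Lagrangian (5.13) that
$\langle 0|J^\mu_A|\eta'\rangle = -i\sqrt{N_f}\, f_\pi\, p^\mu$. (The factor of $\sqrt{N_f}$ here is a novel normalisation, but ensures
that $f_\pi$ is independent of $N_f$ in the large $N$ limit.) We therefore find that
$\sqrt{N}\, b_{\eta'} = (8\pi^2 N/\sqrt{N_f}\,\lambda)\, f_\pi\, m_{\eta'}^2$. Inserting this into (6.32), and using (6.17), we have


$$m_{\eta'}^2 = \frac{4N_f}{f_\pi^2}\, \frac{d^2E}{d\theta^2}\bigg|_{\theta=0}$$


This is the Witten-Veneziano formula. Rather remarkably, it relates the mass of the $\eta'$
meson to the vacuum energy, through $\chi(0)$, of large $N$, pure Yang-Mills theory without quarks.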
It’s worth pausing to see how the $N$ scaling works in this formula. While $E(\theta) \sim N^2$,
we expect that $d^2E/d\theta^2$ is of order 1. Meanwhile, $f_\pi \sim \sqrt{N}$. We then see that
$m_{\eta'}^2 \sim 1/N$ as anticipated previously.


We don’t know how to measure the topological susceptibility $\chi(0)$ experimentally.
Nonetheless, we can use the Witten-Veneziano formula, with $m_{\eta'} \approx 950$ MeV and $f_\pi \approx 93$
MeV and $N_f = 3$ to get $d^2E/d\theta^2 \approx (150\ \mathrm{MeV})^4$.


6.5 Further Reading 


The large N expansion in Yang-Mills was introduced by ’t Hooft in 1974 [97]. (’t Hooft
was astonishingly productive in those years!) Although we didn’t cover it in these 
lectures, ’t Hooft quickly showed how these methods could be used to solve QCD in 
two dimensions, a theory that is now referred to as the ’t Hooft model [98]. 


The discussion of baryons in the 1/N expansion is due to Witten [221], as is the 
1/D expansion in atomic physics [223]. Witten goes on to apply the 1/D expansion to 
helium. It’s clever, but also shows why chemists tend not to adopt this approach. 


The fact that, despite all appearances, dependence on the θ angle survives in the
large N limit was first emphasised by Witten in [220]. The large N limit of the chiral
Lagrangian was constructed in [224, 41], and the Witten-Veneziano formula was introduced
in [222, 199]. The symmetry breaking pattern needed for the chiral Lagrangian
can be proven in the large N limit: this result is due to Coleman and Witten [31]. The
idea that QCD at θ = π spontaneously breaks time reversal was pointed out pre-QCD
and pre-theta by Dashen [38] and is sometimes referred to as the Dashen phase.


The tantalising connection between string theory and the large N expansion can be 
made explicit in a number of low dimensional examples; the lectures by Ginsparg and 
Moore are a good place to start [75]. In d = 3+1 dimensions, this relationship underlies 
the AdS/CFT correspondence [129]. 


Coleman’s lectures remain the go-to place for a gentle introduction to the 1/N ex- 
pansion [32]. Manohar has written an excellent review of the phenomenology of large 
N QCD [132]. Any number of reviews on the gauge/gravity duality also contain a 
discussion of 1/N and its relationship to string theory: I particularly like the lectures 
by McGreevy [135].


7. Quantum Field Theory on the Line


In this section, and the next, we describe the physics of relativistic quantum field 
theories that live in d = 1 + 1 and d = 2 + 1 dimensions. 


There are several reasons to be interested in quantum field theories in lower dimen- 
sions. Perhaps most importantly, these field theories play important roles in condensed 
matter systems. However, it turns out that it is often easier to solve quantum field the- 
ories in lower dimensions. This makes them a testing ground where we can understand 
some of the subtleties of field theory and build some intuition for the kinds of issues
that arise when the interactions between fields become strong.


As we go down in dimension, we find an increased richness in the interactions that a 
field theory can enjoy. More specifically, we find an increase in the number of relevant 
and marginally relevant interactions that theories admit. These are the terms that 
drive us from weakly coupled physics in the UV towards something more interesting 
in the IR. In d = 3 + 1, this can only be achieved by non-Abelian gauge fields. As we 
will see below, in lower dimensions we have other options. This means that Yang-Mills 
theory, which has dominated our lectures so far, becomes somewhat less prominent in 
the story of lower dimensional quantum field theories. 


7.1 Electromagnetism in Two Dimensions 


Maxwell theory in d = 1 + 1 dimensions is rather special. The gauge field is $A_\mu$, with
$\mu = 0,1$ and the corresponding field strength has just a single component $F_{01}$. The
action is given by


$$S = \int d^2x\ -\frac{1}{4e^2}F_{\mu\nu}F^{\mu\nu} + A_\mu j^\mu$$


where $j^\mu$ denotes the coupling to charged matter. Note that we have retained the
notation of Yang-Mills theory where the coupling constant $e^2$ sits outside the action.
With this convention, the matter is taken to have integer valued electric charge.


Electromagnetism in d = 1 + 1 dimensions has a number of properties that are 
rather different from its d = 3 + 1 dimensional counterpart. These occur both at the 
classical and quantum levels. Let’s first look at some basic classical properties. The 
first difference comes in the pure Maxwell theory, which has equation of motion 


$$\partial_0 F^{01} = \partial_1 F^{01} = 0 \qquad (7.1)$$


We see that this allows only for a constant electric field. In particular, there are no
electromagnetic wave solutions in d = 1 + 1 dimensions.


This is an important point and it's worth explaining from a slightly different perspective.
In general d dimensional spacetime, the gauge field is $A_\mu$ with the index
running over $\mu = 0,1,\ldots,d-1$. However, not all of these components are physical.
The standard way to isolate the physical degrees of freedom is to use the gauge symmetry
$A_\mu \to A_\mu + \partial_\mu\omega$ to set $A_0 = 0$. This leaves us with only the spatial gauge fields
$\vec{A}$. However, we still have to impose the equation of motion for $A_0$ which is solved by
insisting that $\nabla\cdot\vec{A} = 0$. This projects out the longitudinal fluctuations of $\vec{A}$, leaving
us just with the transverse modes. The upshot is that the gauge field in d dimensions
carries d − 2 physical degrees of freedom. When d = 3 + 1, these are the familiar two
polarisation modes of the photon. However, in d = 1 + 1 dimensions, there are no
transverse modes and the electromagnetic field has no propagating degrees of freedom.


Now let’s look at what happens when we add matter. The classical equations of 
motion are 


$$\frac{1}{e^2}\partial_\mu F^{\mu\nu} = -j^\nu$$


We can consider placing a point charge q at the origin, so the equation that we have 
to solve is 


$$\frac{1}{e^2}\partial_1 F^{01} = q\,\delta(x) \quad\Rightarrow\quad F^{01} = q e^2\,\theta(x) + \mathcal{E} \qquad (7.2)$$


where $\theta(x)$ is the Heaviside step function ($\theta(x) = 0$ for x < 0 and $\theta(x) = 1$ for x > 0)
and $\mathcal{E}$ is a constant background electric field which is typically fixed by the choice of
electric field at spatial infinity. We see that the electric field emitted by a point charge
in d = 1 + 1 dimensions is constant. (This is the same as the statement that a uniform
surface charge in d = 3 + 1 dimensions gives rise to a constant electric field.)


The energy contained in the electric field is 


$$H = \int dx\ \frac{1}{2e^2}F_{01}^2 \qquad (7.3)$$


This means that a classical point charge in d = 1 + 1 dimensions costs infinite energy. 
The finite energy states must be charge neutral. To this end, consider a charge q at 
position x = —L/2 and a charge —q at position x = +L/2. We have the equation of 
motion 


$$\partial_1 F^{01} = q e^2\left[\delta(x + L/2) - \delta(x - L/2)\right] \quad\Rightarrow\quad F^{01} = \begin{cases} q e^2 & x \in (-L/2, +L/2) \\ 0 & \text{otherwise} \end{cases} \qquad (7.4)$$


where we have chosen the integration constant $\mathcal{E}$ in (7.2) to ensure vanishing electric
field at $x = \pm\infty$. The total energy (7.3) stored in the electric field is


$$E = \frac{q^2 e^2 L}{2}$$


We see that the energy grows linearly with the separation. In other words, electric
charges in d = 1 + 1 dimensions are classically confined. The reason is that the electric
field is forced to form a flux tube, simply because it has nowhere else to go.
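The linear growth of the energy can be checked with a few lines of Python. This is only a minimal sketch: it takes the piecewise-constant field of (7.4), integrates the energy density of (7.3) with a midpoint rule, and compares with $q^2e^2L/2$. The function names, the box size and the grid resolution are all arbitrary choices made for illustration.

```python
# Energy stored in the electric field of a charge pair in d = 1+1, using
# H = int dx F^2 / (2 e^2) with F = q e^2 between the charges and 0 outside.

def field(x, q, e2, L):
    """Electric field of a charge +q at x = -L/2 and a charge -q at x = +L/2."""
    return q * e2 if -L / 2 < x < L / 2 else 0.0

def field_energy(q, e2, L, box=50.0, n=200_000):
    """Midpoint-rule estimate of the field energy on the interval [-box, box]."""
    dx = 2 * box / n
    total = 0.0
    for i in range(n):
        x = -box + (i + 0.5) * dx
        total += field(x, q, e2, L) ** 2 / (2 * e2) * dx
    return total

for separation in (1.0, 2.0, 4.0):
    numeric = field_energy(q=1.0, e2=1.0, L=separation)
    print(separation, numeric, 0.5 * separation)  # numeric vs q^2 e^2 L / 2
```

Doubling the separation doubles the energy, which is the classical statement of confinement.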


7.1.1 The Theta Angle 


As we described above, pure Maxwell theory in d = 1+1 dimensions has no propagating, 
wave-like solutions. This does not, however, mean that the theory is completely devoid 
of content. The classical equations of motion (7.1) still allow for constant electric fields. 
As we now explain, this is enough to give rise to a Hilbert space in the quantum theory. 


We also take this opportunity to add a new ingredient to pure Maxwell theory. This
is a θ term, analogous to the θ terms which we met in four dimensional gauge theories
in Sections 1.2 and 2.2. (In fact, such a term exists in any even spacetime dimension.)
The action is


$$S = \int d^2x\ \left(\frac{1}{2e^2}F_{01}^2 + \frac{\theta}{2\pi}F_{01}\right) \qquad (7.5)$$


Like its four-dimensional counterpart, the theta term is a total derivative and does 
not affect the classical equations of motion. Nonetheless, it does affect the quantum 
spectrum. 


Our first task is to isolate the dynamical degrees of freedom in pure Maxwell theory.
This is best illustrated by taking the theory to live on $\mathbb{R}\times S^1$ where we take the spatial
$S^1$ to have radius R. Although the theory has no propagating degrees of freedom, there
is a single physical mode which is spread all over the $S^1$. It is known as the zero mode


$$\phi(t) = \int_0^{2\pi R} dx\ A_1(x,t) \qquad (7.6)$$


The fact that $\phi(t)$ does not depend on space means that there is no sense in which
it propagates. Said another way, this is just a single degree of freedom rather than the
infinite number of degrees of freedom (one per spatial point) that are typically
contained in a field theory.


The quantity $\phi(t)$ is gauge invariant and dimensionless. Importantly, it is also periodic.
This arises from performing large gauge transformations of the kind that we met a
number of times previously. These are single valued gauge transformations of the form
$e^{i\omega(x)}$ but where ω is not single valued. Instead ω obeys


$$\omega(x = 2\pi R) = \omega(x = 0) + 2\pi n \quad \text{for some } n \in \mathbb{Z}$$


The simplest such example, with n = 1, is just ω = x/R. Under such a gauge transformation,
we have


$$A_1 \to A_1 + \frac{1}{R}$$


Under this, or any gauge transformation with n = 1, the zero mode (7.6) transforms as

$$\phi \to \phi + 2\pi$$

This is the statement that $\phi$ is periodic.


The dynamics of $\phi$ follows from the Lagrangian


$$L = \frac{1}{4\pi e^2 R}\,\dot\phi^2 + \frac{\theta}{2\pi}\,\dot\phi$$


As usual, the θ term does not affect the classical equations of motion, but it does affect
the definition of the canonical momentum p, which is given by


$$p = \frac{1}{2\pi e^2 R}\,\dot\phi + \frac{\theta}{2\pi}$$


The Hamiltonian is then


$$H = \frac{1}{4\pi e^2 R}\,\dot\phi^2 = \pi e^2 R\left(p - \frac{\theta}{2\pi}\right)^2$$


This is precisely the problem of a particle moving on a circle in the presence of flux.
We already met this in Section 2.2 as an analogy which captures some of the aspects
of the four dimensional theta term. We also met it subsequently in Section 3.6 where
we saw that it exhibits an interesting discrete anomaly when θ = π; we won't need
this fact in what follows.


A familiar theme now emerges: although the classical physics remains unchanged
by θ, there is an important effect in the quantum physics. This arises because the
wavefunctions ψ should be single valued. The energy eigenstates are $\psi_l = e^{il\phi}$ with
$l \in \mathbb{Z}$. The spectrum is given by


$$H\psi_l = E_l\psi_l \quad\text{with}\quad E_l = \pi e^2 R\left(l - \frac{\theta}{2\pi}\right)^2$$


The spectrum is periodic in θ as expected. For θ ∈ (−π, π), the ground state is l = 0.
For θ = ±π, there are two degenerate ground states, l = 0 and l = ±1. If we increase
θ → θ + 2π, then the spectrum remains the same, but all the states shift along by one.
This is a phenomenon known as spectral flow.
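The statements about the spectrum are easy to verify numerically. The sketch below assumes only the formula $E_l = \pi e^2R\,(l - \theta/2\pi)^2$; the helper names, the truncation `l_max` and the choice $e^2 = R = 1$ are illustrative.

```python
import math

def energy(l, theta, e2=1.0, R=1.0):
    """E_l = pi e^2 R (l - theta / 2 pi)^2 for the state psi_l = exp(i l phi)."""
    return math.pi * e2 * R * (l - theta / (2 * math.pi)) ** 2

def ground_states(theta, l_max=5, tol=1e-12):
    """Return the list of l (with |l| <= l_max) that minimise E_l at this theta."""
    levels = {l: energy(l, theta) for l in range(-l_max, l_max + 1)}
    e_min = min(levels.values())
    return [l for l, e in levels.items() if abs(e - e_min) < tol]

print(ground_states(0.3))       # a single ground state for theta in (-pi, pi)
print(ground_states(math.pi))   # two degenerate ground states at theta = pi

# Spectral flow: level l at theta has the same energy as level l + 1 at theta + 2 pi.
print(energy(0, 0.3), energy(1, 0.3 + 2 * math.pi))
```

The set of energies is unchanged under θ → θ + 2π, but the label of the state carrying each energy shifts by one.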


7.1.2 The Theta Angle is a Background Electric Field 


There is a particularly simple interpretation of the θ angle in two dimensions: it gives
rise to a background electric field. We have already noticed that, classically, the equation
of motion $\partial_1 F^{01} = 0$ allows for a constant background electric field. In $A_0 = 0$
gauge, this is given by


$$F_{01} = \frac{1}{2\pi R}\,\dot\phi = e^2\left(p - \frac{\theta}{2\pi}\right)$$


so that, in the energy eigenstate $\psi_l$, the electric field takes the value


$$F_{01} = e^2\left(l - \frac{\theta}{2\pi}\right) \qquad l \in \mathbb{Z} \qquad (7.7)$$


We see that the Hilbert space of pure Maxwell theory in d = 1 + 1 dimensions can
be thought of as describing integrally spaced, constant electric fields, shifted by the θ
angle.


The above analysis was all performed on a spatial circle of radius R. However, the
ultimate quantisation of the electric field (7.7) is independent of this radius. Indeed,
there is a particularly simple way to see that the θ angle gives rise to a background
electric field if we work on spatial $\mathbb{R}$. We return to the action (7.5) which, noting that
the θ term is a total derivative, we rewrite as

$$S = \int d^2x\ -\frac{1}{4e^2}F_{\mu\nu}F^{\mu\nu} + \frac{\theta}{2\pi}\oint A_\mu\,dx^\mu$$

where the contour integral should be taken around the boundary of spacetime. Written
this way, it looks like the insertion of a Wilson line, with a particle of charge θ/2π at
x = −∞, together with a particle of charge −θ/2π at x = +∞. As we saw in the
classical analysis leading to (7.4), this results in an electric field $F_{01} = -\theta e^2/2\pi$. This
agrees with the more careful quantum computation (7.7).


Our discussion above suggests that something interesting happens when θ = π:
there are two degenerate ground states. These are the states (7.7) with l = 0 and
l = +1 which have $F_{01} = \mp e^2/2$ respectively. If we were to change θ slowly, passing through the
value θ = π, we jump discontinuously from the background field $F_{01} = -e^2/2$ to the
background field $F_{01} = +e^2/2$. This is an example of a first order phase transition.


Our next task is to understand what happens to our theory when we include dynam- 
ical matter. 


7.2 The Abelian-Higgs Model 


In this section, we consider a U(1) gauge theory coupled to a complex scalar field $\phi$.
The action is


$$S = \int d^2x\ \frac{1}{2e^2}F_{01}^2 + \frac{\theta}{2\pi}F_{01} + |D_\mu\phi|^2 - m^2|\phi|^2 - \frac{\lambda}{2}|\phi|^4 \qquad (7.8)$$

We take the scalar field to have charge q = 1, so that $D_\mu\phi = \partial_\mu\phi - iA_\mu\phi$. In two
dimensions, the gauge coupling has scaling dimension $[e^2] = 2$. This means that
electromagnetism will always be strongly coupled in the infra-red unless some other physics
kicks in at a higher scale. It will be straightforward to understand the dynamics of the
scalar when $|m^2| \gg e^2$, but harder in the regime $|m^2| < e^2$. In what follows, we will
discuss the Abelian-Higgs model in two different semi-classical regimes: $m^2 \gg e^2$ and
$m^2 \ll -e^2$.


$m^2 \gg e^2$: For very large, positive $m^2$, quantization of the scalar field simply gives us
particles and anti-particles, each of mass m and charge q = ±1. These particles then
interact through the two-dimensional Coulomb force. We will call this the Coulomb
phase.


To start our discussion, let's focus on the case θ = 0. A particle of charge q = 1 gives
rise to a constant electric field, $F_{01} = e^2$, which we take to be emitted to the right of
the particle. If an anti-particle, with charge q = −1, sits at a distance L, as shown in
the figure, then we are left with an energy in the electric field given by


$$E = \frac{e^2 L}{2} \qquad (7.9)$$

This linear growth in energy is the characteristic of confinement. We see that, in 
d = 1 + 1 dimensions, confinement occurs rather naturally, with the electric field 
automatically forming a flux tube. Indeed, in two dimensions, the Coulomb phase is 
the same thing as the confining phase.


Figure 51: When θ = 0, there is a confining string between particles and anti-particles.
(The electric field is $F_{01} = 0$ outside the pair and $F_{01} = e^2$ between the charges q = 1 and q = −1.)

Figure 52: When θ = π, the string tensions cancel on either side and alternating
particles/anti-particles feel no long-distance force. (The electric field alternates between
$F_{01} = -e^2/2$ and $F_{01} = +e^2/2$ across the charges q = 1, q = −1, q = 1.)


There is, however, a limit to how far this flux tube can stretch. If we attempt to
separate a particle-anti-particle pair too far, then the energy stored in the string is
greater than the energy required to create a particle-anti-particle pair, and we expect
the string to break. This should happen for $e^2L/2 \gtrsim 2m$ or $L \gtrsim 4m/e^2$. The upshot
of this argument is that we expect the spectrum of the theory to consist of a tower of
neutral meson-like states, each containing a particle and anti-particle. The low-lying
modes of this spectrum can be easily computed using a non-relativistic Schrödinger
equation, although we will not do so here¹².


We could also ask how the theory responds if we insert test charges of $q \notin \mathbb{Z}$. A
particle-anti-particle pair will, once again, be confined by the electric field $F_{01} = qe^2$.
However, the electric field cannot be removed by pair creation of $\phi$ particles, since these
can only result in a change $\Delta F_{01} = e^2$. We learn that these test particles are confined
no matter how far they are separated.


The story does not change much as we turn on θ, until we reach θ = π. Now
something more interesting can happen. Suppose that the electric field at x → −∞ is
given by $F_{01} = -e^2/2$. The presence of a particle of charge q = 1 means that the electric
field jumps to $F_{01} = +e^2/2$. Since its magnitude doesn't change, this particle is free
to roam along the line. We can follow this by a chain of alternating particles and
anti-particles, each of which is free to move at no extra cost of energy (ignoring any
short distance forces between the particles). In this case, the particles are no longer
confined, at least when placed with a particular ordering along the line.
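The picture of the last few paragraphs can be summarised in a small calculation. The sketch below is a heuristic, not something taken from the lectures: it assumes that the energy density is the classical expression (7.3) evaluated on the quantised fields (7.7), that pair creation of the unit-charge scalar lets the integer l relax to its preferred value, and that a test charge q simply shifts the field between the charges by $qe^2$. All names are illustrative.

```python
import math

def energy_density(theta, q=0.0, e2=1.0, l_max=20):
    """Lowest field energy density (e^2 / 2)(l + q - theta / 2 pi)^2 over integers l.

    The integer l can relax by pair creation of the unit-charge scalar; the
    external charge q cannot, so it enters as a fixed shift of the field.
    """
    return min(0.5 * e2 * (l + q - theta / (2 * math.pi)) ** 2
               for l in range(-l_max, l_max + 1))

def string_tension(theta, q, e2=1.0):
    """Energy per unit length between test charges +q and -q, after screening."""
    return energy_density(theta, q, e2) - energy_density(theta, 0.0, e2)

print(string_tension(0.0, 1.0))      # integer charge: the string breaks
print(string_tension(0.0, 0.5))      # fractional charge: confined
print(string_tension(math.pi, 1.0))  # theta = pi: no long-distance force
```

Within these assumptions, integer charges feel no long-distance force once the string has broken, fractional charges remain confined at θ = 0, and a unit charge at θ = π merely flips the sign of the background field.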


$m^2 \ll -e^2$: With a large negative mass-squared, the scalar condenses. The minimum
of the classical potential lies at


$$|\phi|^2 = -\frac{m^2}{\lambda} \qquad (7.10)$$


¹² See, for example, the discussion of the linear potential and Airy function in the lectures on
Applications of Quantum Mechanics.


Our naive expectation is that we now lie in the Higgs phase, with the electric field
screened and the charged particles free to roam at will. Rather strikingly, this naive
expectation is completely wrong. Instead, it turns out that the physics in this regime
is exactly the same as the physics when $m^2 \gg e^2$. As we now explain, this is due to a
special property of Abelian gauge theories in two dimensions.


7.2.1 Vortices 


The new ingredient is the existence of vortices. These are solutions to the equations
of motion that exist when the theory is formulated in Euclidean space. These
same vortices were discussed in Section 2.5.2, where they arise as string-like solutions 
in d = 3+ 1 dimensions. In contrast, these same solutions will now be localised in 
spacetime; they play a role similar to the instantons discussed in Section 2.3 although, 
as we shall see, their effect is arguably more profound: they destroy the long-range 
order (7.10). 


To see this, let's first formulate the action in Euclidean space. We write the action
(7.8) as


$$S_E = \int d^2x\ \frac{1}{2e^2}F_{12}^2 + \frac{i\theta}{2\pi}F_{12} + |D_i\phi|^2 + \frac{\lambda}{2}\left(|\phi|^2 - v^2\right)^2 \qquad (7.11)$$

where now i = 1,2. We have written the Higgs vev as $v^2 = -m^2/\lambda$. A finite action
configuration requires $|\phi| \to v$ as $r \to \infty$. This provides us with some interesting
topology: the asymptotic $S^1_\infty$ of Euclidean spacetime is mapped into the $S^1$ defined
by $|\phi| = v$. Mathematically, this means that field configurations are characterised by
$\Pi_1(S^1) = \mathbb{Z}$, in which the phase of $\phi$ winds asymptotically. For example, we may take


$$\phi \to v\,e^{in\theta} \qquad (7.12)$$


where θ is the polar coordinate on the spatial $\mathbb{R}^2$. This is single valued for $n \in \mathbb{Z}$. This
integer n is called the winding. Configurations with n > 0 are called vortices; those
with n < 0 are anti-vortices.


However, a scalar that winds in this way has infinite action unless it is also accompanied
by a non-vanishing gauge field. This is because the gradient terms are given
by


$$\int d^2x\ |\partial_i\phi|^2 = \int d\theta\,dr\ r\,\frac{1}{r^2}|\partial_\theta\phi|^2 + \ldots = 2\pi n^2\int_0^\infty dr\ \frac{v^2}{r}$$

which is logarithmically divergent. We see that the trouble arises because the gradient
terms fall off too slowly, as 1/r. To compensate for this, we must turn on a gauge field
$A_i$, such that $D_i\phi = \partial_i\phi - iA_i\phi$ falls off at a faster rate. For a configuration that winds
as (7.12), this means that the gauge field must take the asymptotic form $A_\theta \to n/r$
which, in turn, tells us that vortices are accompanied by a quantised flux


$$\frac{1}{2\pi}\int d^2x\ F_{12} = \frac{1}{2\pi}\oint d\theta\ r A_\theta = n \qquad (7.13)$$
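The two statements above, the logarithmic divergence of the ungauged gradient energy and the quantised flux carried by the compensating gauge field, can be illustrated numerically. The following is a minimal sketch that assumes the pure winding profile $\phi = v\,e^{in\theta}$ everywhere and the asymptotic gauge field $A_\theta = n/r$; the cut-offs `r_min` and `r_max` and the function names are illustrative choices.

```python
import math

def gradient_energy(n, v, r_min, r_max, steps=20_000):
    """Riemann sum for 2 pi n^2 v^2 int dr / r between the cut-offs r_min and r_max.

    This is the gradient energy of phi = v exp(i n theta) with no gauge field,
    after the angular integral has been done. A geometric grid in r is used.
    """
    ratio = (r_max / r_min) ** (1.0 / steps)
    total, r_lo = 0.0, r_min
    for _ in range(steps):
        r_hi = r_lo * ratio
        r_mid = 0.5 * (r_lo + r_hi)
        total += 2 * math.pi * (n * v) ** 2 / r_mid * (r_hi - r_lo)
        r_lo = r_hi
    return total

def covariant_gradient(n, v, r, a_theta):
    """|D_theta phi| = |n / r - A_theta| v for the winding profile at radius r."""
    return abs(n / r - a_theta) * v

def flux_quantum(n, r):
    """(1 / 2 pi) times the loop integral of r A_theta at radius r, for A_theta = n / r."""
    return (2 * math.pi * r * (n / r)) / (2 * math.pi)

for r_max in (1e2, 1e4, 1e6):
    print(r_max, gradient_energy(1, 1.0, 1.0, r_max))  # grows like log(r_max)

print(covariant_gradient(1, 1.0, 10.0, a_theta=1 / 10.0))  # vanishes with A_theta = n / r
print(flux_quantum(3, 10.0))                                # equals the winding n
```

The energy grows by the same amount each time the outer cut-off is multiplied by a fixed factor, which is the logarithmic divergence, while turning on $A_\theta = n/r$ removes the offending angular gradient at the price of a flux equal to the winding.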
One can construct solutions to the equations of motion with this asymptotic behaviour
by working with an ansatz of the form $\phi(x) = e^{in\theta}g_n(r)$ and $A_\theta = n f_n(r)$, where
the radial functions $g_n(r)$ and $f_n(r)$ obey second order differential equations subject to
certain boundary conditions. The exact form of these solutions will not concern us
here: all we need is the statement that solutions always exist for n = ±1. In this
solution, the flux is restricted to a region of size 1/ev, while the scalar field deviates
from the vacuum over a region $1/\sqrt{\lambda}v$. We'll denote the vortex size, a, by the larger
of these two scales,


$$a = \max\left(\frac{1}{ev}\,,\ \frac{1}{\sqrt{\lambda}\,v}\right)$$

We will also denote the real part of the action for a single, n = ±1, vortex as $S_{\rm vortex}$.


Because the vortices come with flux (7.13), their contribution to the path integral will
have the characteristic form


$$e^{-S_{\rm vortex} \pm i\theta}$$


where the ± sign distinguishes a vortex from an anti-vortex.


So much for solutions with n = ±1. What about vortices with higher winding? It
turns out that solutions exist for higher n, but only when $\lambda < e^2$. Nonetheless, we
shall not make use of these solutions. Instead, it will suffice to consider a dilute gas of
n = ±1 vortices separated by distances ≫ a.


Summing over Vortices 


Let's start by computing the partition function,


$$Z[\theta] = \int \mathcal{D}A\,\mathcal{D}\phi\ \exp\left(-S_E[A,\phi]\right)$$


As always, the partition function depends on the parameters, or sources, of the action.
As the notation suggests, we will be particularly interested in the dependence on
the theta angle. In the semi-classical approximation, this path integral gets contributions
from the (approximate) solutions of far-separated vortices and anti-vortices. The
strategy for performing these kinds of calculations was sketched in Section 2.3.3 in the
context of the double well potential in quantum mechanics. The contribution from a
single vortex takes the schematic form


$$Z_{\rm vortex}[\theta] = VK\,e^{-S_{\rm vortex} + i\theta}$$


Here V denotes the volume of spacetime (which, of course, is really an area since we are
in two dimensions). This factor comes from the fact that the vortex can sit anywhere.
V is, of course, infinite if we work on $\mathbb{R}^2$ but it will prove useful to consider it finite
for now. The factor K comes from computing the one-loop determinant contribution
around the background of the vortex; it will depend on parameters such as $e^2$, $v^2$
and λ but its precise form will not be important for our needs. Finally, we have the
characteristic exponential suppression of the vortex. Similarly, for an anti-vortex we
have


$$Z_{\rm anti-vortex}[\theta] = VK\,e^{-S_{\rm vortex} - i\theta}$$


For our final expression, we sum over a dilute gas with all possible combinations of p
vortices and $\bar{p}$ anti-vortices, to get

$$Z[\theta] = \sum_{p,\bar{p}}\frac{1}{p!\,\bar{p}!}\left(VK e^{-S_{\rm vortex}}\right)^{p+\bar{p}} e^{i(p-\bar{p})\theta} = \exp\left(2VK e^{-S_{\rm vortex}}\cos\theta\right) \qquad (7.14)$$

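The resummation of the dilute gas is just the product of two exponential series, one for vortices and one for anti-vortices. The sketch below checks this numerically by truncating the double sum. Here `x` is shorthand for the combination $VKe^{-S_{\rm vortex}}$, each vortex (anti-vortex) is assumed to carry the phase $e^{+i\theta}$ ($e^{-i\theta}$), and the truncation `p_max` is an arbitrary choice.

```python
import cmath
import math

def dilute_gas_sum(x, theta, p_max=60):
    """Truncated double sum over p vortices and pbar anti-vortices.

    x stands for the combination V K exp(-S_vortex); each vortex carries a
    phase exp(+i theta) and each anti-vortex a phase exp(-i theta).
    """
    total = 0.0 + 0.0j
    for p in range(p_max):
        for pbar in range(p_max):
            weight = x ** (p + pbar) / (math.factorial(p) * math.factorial(pbar))
            total += weight * cmath.exp(1j * (p - pbar) * theta)
    return total

def closed_form(x, theta):
    """exp(2 x cos(theta)), the resummed dilute-gas partition function."""
    return math.exp(2 * x * math.cos(theta))

numeric = dilute_gas_sum(1.5, 0.7)
print(numeric.real, numeric.imag, closed_form(1.5, 0.7))
```

The imaginary parts cancel between configurations with p and $\bar{p}$ exchanged, leaving a real partition function that is manifestly periodic under θ → θ + 2π.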
What physics can we extract from this? First, this result tells us how the ground state
energy varies as a function of θ. For this, we need to recall the interpretation of the
partition function as a propagator between states,


$$Z[\theta] = \langle\theta|e^{-HT}|\theta\rangle = e^{-E_0(\theta)T}\langle\theta|\theta\rangle$$

If we write V = TR, with T the size of the temporal direction, and R the radius of the
spatial direction, then we find the ground state energy density

$$\frac{E_0(\theta)}{R} = -2K e^{-S_{\rm vortex}}\cos\theta \qquad (7.15)$$

We can also compute the expected value of the background electric field. This is


$$\langle F_{12}\rangle = \frac{2\pi i}{V}\frac{\partial}{\partial\theta}\log Z[\theta] = -4\pi i K e^{-S_{\rm vortex}}\sin\theta$$

The fact that the right-hand-side is imaginary should not concern us; after Wick rotating
back to Lorentzian signature, we get the result

$$\langle F_{01}\rangle = 4\pi K e^{-S_{\rm vortex}}\sin\theta$$

We see that turning on a θ angle once again induces a background electric field. Admittedly,
there are some differences from the case of pure electromagnetism (7.7) or,
indeed, the case of $m^2 \gg e^2$. In particular, the electric field is maximum at θ = π/2,
rather than θ = π.


Classically, the energy density in the electric field is proportional to $F_{01}^2$. Quantum
mechanically, the energy density (7.15) is not proportional to $\langle F_{01}\rangle^2$; instead, it is
proportional to $\langle F_{01}^2\rangle \sim \partial^2/\partial\theta^2 \log Z$. This is telling us that there are large fluctuations
in the electric field. At θ = π, it is these fluctuations which are contributing to the
energy, even though $\langle F_{01}\rangle = 0$.


Note in particular, that when θ = π, there is a change in the vacuum structure: when
$m^2 \gg e^2$, there were two values for the electric field, $\langle F_{01}\rangle = \pm e^2/2$, while for $m^2 \ll -e^2$
there is just one, $\langle F_{01}\rangle = 0$. This behaviour is characteristic of a phase transition and
we will return to it shortly when we sketch the phase diagram of the theory.


7.2.2 The Wilson Loop 


We can now address our main question of interest: when $m^2 \ll -e^2$, are charged
particles screened, as one would expect in a Higgs phase? To answer this we use the
Wilson loop, introduced in Section 2.5.3, describing the insertion of a particle with
charge q, and an anti-particle with charge −q,


$$W[C] = \exp\left(iq\oint_C A\right) \qquad (7.16)$$


Here C is the rectangular loop; the particle and anti-particle are separated by a spatial
distance L, and propagate for time T′. We will take each of these distances to be much
larger than the size of the vortices, so L, T′ ≫ a, but much smaller than the size of our
universe, so L ≪ R and T′ ≪ T.


We would like to compute the expectation value of the Wilson loop,

$$\langle W[C]\rangle = \frac{1}{Z}\int \mathcal{D}A\,\mathcal{D}\phi\ W[C]\,\exp\left(-S_E[A,\phi]\right) \qquad (7.17)$$


But this is particularly simple in the semi-classical approximation. First, we assume
that we can divide all (anti) vortices into those inside the loop C, and those outside.
This ignores those vortices that happen to overlap with the curve C, but these should
be negligible when C is large. In the semi-classical approximation, the expression (7.17)
decomposes into two pieces; one from inside the loop and the other from outside the
loop,

$$\int \mathcal{D}A\,\mathcal{D}\phi\ W[C]\,\exp\left(-S_E[A,\phi]\right) = Z_{\rm inside}[\theta]\ Z_{\rm outside}[\theta]$$

The contribution from outside the loop is given by our original expression for Z[θ]
(7.14), but with the area of spacetime V reduced by the area of the loop,


$$Z_{\rm outside}[\theta] = \exp\left(2(V - LT')K e^{-S_{\rm vortex}}\cos\theta\right)$$


Meanwhile, the Wilson loop affects only the contribution $Z_{\rm inside}$ from inside the loop.
In a given background, the Wilson loop (7.16) simply counts the total winding number,
ν = #(vortices) − #(anti-vortices) in the loop,


$$W[C] = e^{2\pi i q\nu}$$


Comparing to the expression (7.14), we see that the Wilson loop effectively shifts the
theta angle θ → θ + 2πq. We therefore have


$$Z_{\rm inside}[\theta] = \exp\left(2LT'K e^{-S_{\rm vortex}}\cos(\theta + 2\pi q)\right)$$

Combining these results, the expectation value of the Wilson loop becomes

$$\langle W[C]\rangle = \exp\left(2LT'K e^{-S_{\rm vortex}}\left[\cos(\theta + 2\pi q) - \cos\theta\right]\right)$$


Our task now is to interpret this result. First notice that, for $q \notin \mathbb{Z}$, the Wilson loop
exhibits an area law, telling us that the charges are confined. The string tension is
given by the energy density


$$\sigma = -2K e^{-S_{\rm vortex}}\left[\cos(\theta + 2\pi q) - \cos\theta\right] \qquad (7.18)$$

This is already surprising, since it disagrees with our naive expectation that all charges
should be screened in the Higgs phase. Instead, charges $q \notin \mathbb{Z}$ are confined, just as
they are in the Coulomb phase with $m^2 \gg e^2$. In contrast, the string tension vanishes
for q = 1. But, this too, agrees with the Coulomb phase picture, where pair creation of
$\phi$ particles results in the string breaking, and the test particles forming gauge neutral
meson states.


We learn that, in the d = 1+1 Abelian Higgs model, there is no qualitative distinction
between the behaviour of the theory at $m^2 \gg e^2$ and $m^2 \ll -e^2$. In both cases, the
charged particles are confined. The only difference is a quantitative one: the string
tension (7.18) is exponentially suppressed when $m^2 \ll -e^2$, compared to its value (7.9)
when $m^2 \gg e^2$.
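The way the area law emerges from the inside/outside split can be made concrete with a short calculation. The sketch below assumes the dilute gas result $\log Z = 2\,({\rm area})\,Ke^{-S_{\rm vortex}}\cos\theta$ on each region and the shift θ → θ + 2πq inside the loop; `k_eff` is shorthand for $Ke^{-S_{\rm vortex}}$ and, like the other names and the default volume, is an illustrative choice.

```python
import math

def log_z(area, theta, k_eff):
    """log Z = 2 * area * k_eff * cos(theta), with k_eff = K exp(-S_vortex)."""
    return 2 * area * k_eff * math.cos(theta)

def log_wilson_loop(theta, q, L, T, k_eff, V=1.0e6):
    """log <W[C]> = log Z_inside + log Z_outside - log Z for an L x T loop."""
    inside = log_z(L * T, theta + 2 * math.pi * q, k_eff)
    outside = log_z(V - L * T, theta, k_eff)
    return inside + outside - log_z(V, theta, k_eff)

def string_tension(theta, q, k_eff):
    """Coefficient of the area law, <W[C]> ~ exp(-tension * L * T)."""
    return -2 * k_eff * (math.cos(theta + 2 * math.pi * q) - math.cos(theta))

print(string_tension(0.0, 0.5, 0.01))   # non-integer charge: non-zero tension
print(string_tension(0.3, 1.0, 0.01))   # integer charge: tension vanishes
print(log_wilson_loop(0.3, 0.25, 10.0, 20.0, 0.01),
      -string_tension(0.3, 0.25, 0.01) * 10.0 * 20.0)
```

The total volume V drops out of the ratio, so the result depends only on the area of the loop, and it is periodic in q with period one, which is why integer charges are screened.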


The Phase Diagram of the Abelian Higgs Model 


The discussion above strongly suggests that there is no phase transition as we move
from $m^2 \gg e^2$ to $m^2 \ll -e^2$: the would-be Higgs phase is washed away by vortices,
leaving us only with the Coulomb phase.


Figure 53: The phase diagram of the 2d Abelian Higgs model


However, there is one remaining subtlety, which occurs at θ = π. As we saw above,
there are two degenerate ground states, $\langle F_{01}\rangle = \pm e^2/2$ when $m^2 \gg e^2$, with a first order
phase transition between them as we vary θ through π. In contrast, there is a unique
ground state $\langle F_{01}\rangle = 0$ when $m^2 \ll -e^2$. This line of first order phase transitions must
end somewhere. The simplest possibility is that it ends at a critical point at some
value of the mass, presumably around $m^2 \sim -e^2$. Since the order parameter, $F_{01}$, is
a parity-odd real scalar, it is natural to conjecture that this critical point is described
by the d = 2 Ising CFT. The resulting phase diagram for the d = 1 + 1 Abelian Higgs
model is shown in the figure.


(As an aside: The story above is similar to, but ultimately different from, the story
of the XY-model in d = 1 + 1 dimensions. This theory describes a complex scalar
without the associated gauge field and was discussed in the lectures on Statistical Field
Theory. Once again, vortices play an important role, but this time they induce the
Kosterlitz-Thouless phase transition.)


7.3 The $\mathbb{CP}^{N-1}$ Model


We now turn to a theory that is closely related to the Abelian Higgs model. It consists
of N complex scalars, $\phi_a$, a = 1,...,N, each coupled to a U(1) gauge field with charge
q = +1.


Our interest will lie in the theory where all scalars have negative $m^2$ so, following
(7.11), we write the action in Euclidean space as


$$S = \int d^2x\ \frac{1}{2e^2}F_{12}^2 + \frac{i\theta}{2\pi}F_{12} + \sum_{a=1}^N |D_i\phi_a|^2 + \frac{\lambda}{2}\Big(\sum_{a=1}^N |\phi_a|^2 - v^2\Big)^2 \qquad (7.19)$$


Note that our theory has a SU(N) global symmetry, acting in the obvious way on
the $\phi_a$. This will be important below. As always, we would like to ask: what is the
low-energy physics? This arises in the limit $e^2 \to \infty$ and $\lambda \to \infty$.


We can first look classically. At low-energies, the scalars sit in the minima of the
potential,


$$\sum_{a=1}^N |\phi_a|^2 = v^2 \qquad (7.20)$$


This restricts the values of the complex $\phi$ fields to lie on an $S^{2N-1}$ sphere of radius v.
But we still have to divide out by gauge transformations. These identify configurations
related by

$$\phi_a \to e^{i\alpha}\phi_a$$

We're left with scalar fields $\phi_a$ which parameterise the manifold,

$$S^{2N-1}/U(1) = \mathbb{CP}^{N-1}$$


The manifold $\mathbb{CP}^{N-1}$ is known as complex projective space; it can be equivalently
defined as the space of all complex lines in $\mathbb{C}^N$ which pass through the origin. $\mathbb{CP}^{N-1}$
has real dimension 2(N − 1), or complex dimension N − 1, and should be thought of
as the complex analog of a round sphere, with the SU(N) global symmetry descending
to an isometry of $\mathbb{CP}^{N-1}$.


To proceed, we could choose to parameterise the $\phi_a$ by coordinates $X^m$ on $\mathbb{CP}^{N-1}$.
Plugging this back into our action would result in a non-linear sigma model of the kind


$$S = \int d^2x\ g_{mn}(X)\,\partial_i X^m\,\partial_i X^n \qquad (7.21)$$


where $g_{mn}(X)$ is the metric on $\mathbb{CP}^{N-1}$. (There is an additional term coming from the
theta angle that we will discuss below.) For our purposes, however, it will prove more
useful to work with the action (7.19); this form of the action is sometimes referred to
as a gauged linear sigma model.


Classically, we learn that our $\mathbb{CP}^{N-1}$ model describes N − 1, interacting, massless
complex scalars. These are Goldstone modes. Indeed, picking a solution to (7.20)
breaks the global SU(N) symmetry to SU(N − 1) × U(1), and the target space $\mathbb{CP}^{N-1}$
can equivalently be written as the coset space

$$\mathbb{CP}^{N-1} = \frac{SU(N)}{SU(N-1)\times U(1)}$$


The interactions between the Goldstone modes are determined by the coupling $v^2$,
which is the size of $\mathbb{CP}^{N-1}$ or, more pertinently, the inverse curvature. This means
that the theory is weakly coupled when $v^2 \gg 1$, and strongly coupled when $v^2 \ll 1$.
However, as we should now expect, we don't get to choose, since quantum fluctuations
will cause $v^2$ to change as we flow towards the infra-red. Do we flow to weak coupling
or strong coupling? As we will see below, the answer is that we flow to strong coupling:
the $\mathbb{CP}^{N-1}$ sigma model in two dimensions is asymptotically free.


7.3.1 A Mass Gap 


Rather than compute the beta function for v?, we will instead jump straight to figuring 
out the low-energy dynamics. This will give us the interesting information that we care 
about and, indirectly, also allow us to extract the beta function. 


We're interested in the low-energy limit, $e^2, \lambda \to \infty$. We force the fields to live in the
minima (7.20) by using a Lagrange multiplier constraint, and replace the action (7.19)
with


$$S = \int d^2x\ \sum_{a=1}^N |D_i\phi_a|^2 + i\sigma\Big(\sum_{a=1}^N |\phi_a|^2 - v^2\Big) + \frac{i\theta}{2\pi}F_{12} \qquad (7.22)$$


where σ is now a dynamical field. Note that σ comes with a factor of i because we want
it to impose the constraint (7.20) as a delta function. This will result in some strange
looking factors of i in the effective potential below. However, upon Wick rotating back
to Lorentzian signature, σ → iσ and everything looks nice and real again.


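We have succeeded in writing the path integral so that the φ_a occur quadratically.
They can now be happily integrated out, and we’re left with the partition function,

$$ Z = \int \mathcal{D}A\,\mathcal{D}\sigma\,\mathcal{D}\phi\,\mathcal{D}\phi^\star\; e^{-S} = \int \mathcal{D}A\,\mathcal{D}\sigma\; e^{-S_{\rm eff}} $$

with

$$ S_{\rm eff} = N\,{\rm tr}\log\left(-(\partial_\mu - iA_\mu)^2 + i\sigma\right) - i\int d^2x\,\Big(v^2\sigma - \frac{\theta}{2\pi}F_{12}\Big) \qquad (7.23) $$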


The problem is that we’re now left with a very complicated looking path integral over
the auxiliary fields A_μ and σ. In general, this is hard. However, some respite comes from
the factor of N in front of the first term, which suggests that one can evaluate the
integral using the saddle point in the limit N → ∞. This is rather similar to the large
N expansion that we met in Section 6 for Yang-Mills. It turns out, perhaps reasonably,
that theories like the CP^{N-1} model, where the number of fields grows linearly with N,
are much easier to deal with than Yang-Mills, where the number of fields grows as N².

To proceed, we will first restrict to configurations with A_μ = 0, and extract an
effective potential for the constant value of the auxiliary scalar σ. The trace above is
an integral over momentum,

$$ V_{\rm eff}(\sigma) = N\int \frac{d^2k}{(2\pi)^2}\,\log(k^2 + i\sigma) - iv^2\sigma $$


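The integral is divergent and requires us to introduce a UV cut-off Λ_UV. Performing
the integral then gives

$$ \begin{aligned}
V_{\rm eff}(\sigma) &= \frac{N}{4\pi}\left[ i\sigma\log\left(\frac{i\sigma+\Lambda_{UV}^2}{i\sigma}\right) + \Lambda_{UV}^2\log\left(\frac{i\sigma+\Lambda_{UV}^2}{\Lambda_{UV}^2}\right)\right] - iv^2\sigma \\
&= \frac{N}{4\pi}\, i\sigma\left[1 - \log\left(\frac{i\sigma}{\Lambda_{UV}^2}\right)\right] - iv^2\sigma + \ldots \qquad (7.24)
\end{aligned} $$

where, to reach the second line, we’ve Taylor expanded in σ/Λ_UV², and the ... include
constant terms and terms which vanish as Λ_UV → ∞.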
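
The cut-off integral is easy to check numerically. The following is a minimal sketch, not part of the original derivation: it replaces iσ by a real, positive m² (as happens after rotating back to Lorentzian signature), drops the overall factor of N and the tree-level term, and subtracts the σ-independent constant so that the integrand is log(1 + m²/k²). The function names and the midpoint-rule quadrature are choices made here purely for illustration.

```python
import math

def loop_integral_closed(m2, cutoff2):
    """First line of (7.24) per flavour, with i*sigma -> m2 > 0 and no tree-level term.
    This is the integral of log(1 + m2/k^2) d^2k/(2 pi)^2 over k^2 < cutoff2."""
    return (m2 * math.log((m2 + cutoff2) / m2)
            + cutoff2 * math.log((m2 + cutoff2) / cutoff2)) / (4 * math.pi)

def loop_integral_expanded(m2, cutoff2):
    """Second line of (7.24): the leading behaviour when m2 << cutoff2."""
    return m2 * (1 - math.log(m2 / cutoff2)) / (4 * math.pi)

def loop_integral_numeric(m2, cutoff2, steps=100_000):
    """Midpoint rule in the variable t, with k^2 = cutoff2 * t^2 and 0 < t < 1.
    In polar coordinates d^2k/(2 pi)^2 = d(k^2)/(4 pi)."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        total += 2 * cutoff2 * t * math.log(1 + m2 / (cutoff2 * t * t))
    return total * h / (4 * math.pi)

if __name__ == "__main__":
    m2, cutoff2 = 1.0, 1.0e4
    print("numeric :", loop_integral_numeric(m2, cutoff2))
    print("closed  :", loop_integral_closed(m2, cutoff2))
    print("expanded:", loop_integral_expanded(m2, cutoff2))
```

The three numbers should agree up to corrections of order m⁴/Λ_UV², which is the statement that the dropped terms vanish as the cut-off is removed.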


We still have to do the path integral over σ and that will, in general, be hard.
However, the overall factor of N provides a glimmer of hope, because it means that the
integral will be dominated by the saddle point in the N → ∞ limit. This saddle point
is given by

$$ \frac{\partial V_{\rm eff}}{\partial\sigma} = 0 \;\;\Rightarrow\;\; \frac{iN}{4\pi}\log\left(\frac{i\sigma}{\Lambda_{UV}^2}\right) = -iv^2 \;\;\Rightarrow\;\; i\sigma = \Lambda_{UV}^2\exp\left(-\frac{4\pi v^2}{N}\right) \qquad (7.25) $$

There are a number of different lessons to take from this. First, note that the CP^{N-1}
model has undergone the phenomenon of dimensional transmutation that we saw in
Yang-Mills theory. The original Lagrangian (7.19) has only dimensionless parameters
(at least, this is true after we have sent e² → ∞). Nonetheless, the theory generates a
physical dimensionful scale, arising from the UV cut-off Λ_UV in the partition function,

$$ \Lambda_{CP^{N-1}} = \Lambda_{UV}\exp\left(-\frac{2\pi v^2}{N}\right) \qquad (7.26) $$


The scale Λ_{CP^{N-1}} is entirely analogous to Λ_QCD (2.59) that arises in Yang-Mills. While
the cut-off Λ_UV is unphysical, the low-energy Λ_{CP^{N-1}} is the scale at which interesting
physical things can happen. This is sensible only because the dimensionless coupling
v² runs under RG. In (7.26) the coupling should be thought of as being evaluated at
the cut-off, v² = v²(Λ_UV). More generally, the physical scale is written as

$$ \Lambda_{CP^{N-1}} = \mu\exp\left(-\frac{2\pi v^2(\mu)}{N}\right) $$

From the requirement that this physical scale is invariant under RG, we can extract the
beta function for v²,

$$ \frac{d\Lambda_{CP^{N-1}}}{d\mu} = 0 \;\;\Rightarrow\;\; \mu\frac{dv^2}{d\mu} = \frac{N}{2\pi} \qquad (7.27) $$

This tells us that v² gets smaller as we flow towards the IR (small μ). From our previous
discussion, we know that this is the strong coupling limit of the CP^{N-1} model. In other
words, this beta function tells us that, just like Yang-Mills, the CP^{N-1} model is strongly
coupled in the IR, and asymptotically free in the UV.
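
For completeness, the intermediate step can be written out. The sketch below uses only the statement that the scale μ exp(−2πv²(μ)/N) is independent of μ; the reference scale μ_0 is a label introduced here for illustration.

```latex
\begin{align}
0 = \mu\frac{d}{d\mu}\left[\mu\, e^{-2\pi v^2(\mu)/N}\right]
  &= \mu\, e^{-2\pi v^2(\mu)/N}\left(1 - \frac{2\pi}{N}\,\mu\frac{dv^2}{d\mu}\right)
  \quad\Rightarrow\quad \mu\frac{dv^2}{d\mu} = \frac{N}{2\pi} \\
v^2(\mu) &= v^2(\mu_0) + \frac{N}{2\pi}\log\frac{\mu}{\mu_0}
  = \frac{N}{2\pi}\log\frac{\mu}{\Lambda_{CP^{N-1}}}
\end{align}
```

In this form the running is explicit: v² grows logarithmically in the UV, and the would-be weak-coupling description breaks down as μ approaches Λ_{CP^{N-1}}, where v² is no longer large.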


Although the physics very much parallels that of Yang-Mills theory, it’s worth pointing
out that the logic of our derivation is somewhat different. For Yang-Mills, we started off
by computing the one-loop beta function and, from that, extracted the physical scale
Λ_QCD. For the CP^{N-1} model, our discussion ran the other way round. Both are valid.


So far, we’ve figured out that there is a dynamically generated scale Λ_{CP^{N-1}}. But
what happens at this scale? To see this, we need to note that, from (7.25), we have
iσ = Λ²_{CP^{N-1}}. But substituting this into (7.22), we see that an expectation value
for σ acts as a mass term for our original fields φ_a. In other words, the 2d CP^{N-1}
sigma model is not a theory of massless Goldstone modes at all! In the quantum
theory, these massless modes pick up a mass given by Λ_{CP^{N-1}}. Moreover, the SU(N)
global symmetry is restored at low energies. This is an example of the Mermin-Wagner
theorem which states that there can be no Goldstone bosons in two dimensions¹³.


Once again, we see the close analogy with Yang-Mills. Both theories appear massless
but actually have a gap. The difference is that we can actually show this for the CP^{N-1}
model.


7.3.2 Confinement 


¹³ We met another example of the Mermin-Wagner theorem in the lecture notes on Statistical Field
Theory. There we discussed the O(N) model, a non-linear sigma model with target space S^{N-1}; it is the
real version of the CP^{N-1} model. Indeed, the first two models in each class coincide at the bottom
of the list, since CP¹ = S². After this, the models differ. In particular, the CP^{N-1} models have
instantons for all N, while the O(N) models do not for N ≥ 4. Nonetheless, the two classes of models
share the same fate. Both are gapped at low energies.

So far we have ignored the role of the gauge field in the effective action (7.23). At
leading order, the effect of integrating out the scalars φ_a is captured by two Feynman
diagrams:

[Figure: the two one-loop Feynman diagrams]


These generate a Maxwell kinetic term

$$ S_{\rm eff} = \frac{N}{48\pi\Lambda^2_{CP^{N-1}}}\int d^2x\; F_{\mu\nu}F^{\mu\nu} $$

Note that we started with a Maxwell term in our original action (7.19), but sent e² →
∞. This was to no avail: we generate a new term at one-loop, now with a coefficient
that is comparable to the mass gap in the theory.


The upshot of our discussion is that the low-energy physics of the CP^{N-1} model is that
of N massive scalars, each with mass m = Λ_{CP^{N-1}}, interacting through an unbroken
U(1) gauge field. As we saw in Section 7.1, electromagnetism gives rise to a linear,
confining force between charged particles in two dimensions. The original scalars φ_a
transform in the N of the SU(N) global symmetry. We learn that not only are these
now massive, but they are also confined. The physical spectrum of the theory consists
of massive SU(N) singlets. These are mesons, constructed from φ and φ*.


7.3.3 Instantons 


The low-energy physics of the CP^{N-1} model is very similar to that of the Abelian
Higgs model that we met in Section 7.2. In both cases, the quantum theory eschews
the Higgs phase, and the fundamental excitations are confined. Yet the way we reached
these conclusions is rather different. For the Abelian Higgs model, we placed the blame
firmly on the instantons (which we identified as vortices); for the CP^{N-1} model, we
reached the same conclusion but using the large N expansion.


We could ask: are there instantons in the CP^{N-1} model? And, if so, what role do
they play?


The answer to the first question is: yes, the CP^{N-1} model does have instantons.
There are actually two different ways to see this. If we start with the gauged linear
model (7.19), then the instantons again arise as vortices. (Vortices with more than one
scalar field sometimes go by the unhelpful name of “semi-local vortices”.) They are
labelled by a winding number

$$ n = \frac{1}{2\pi}\int d^2x\; F_{12} \qquad (7.28) $$

Alternatively, if we work with the non-linear sigma-model (7.21), these instantons show
up in a rather different guise. Here field configurations are a map from spatial R² →
CP^{N-1}. However, we must first choose a point on the CP^{N-1} target space which is
the vacuum. This choice breaks the SU(N) symmetry down to SU(N − 1) × U(1).
The requirement that the fields asymptote to this vacuum point at spatial infinity
means that field configurations are really a map from S² → CP^{N-1}, and these are
characterised by the winding number

$$ \Pi_2(\mathbf{CP}^{N-1}) = \mathbf{Z} $$


This winding is given by

$$ n = \frac{1}{2\pi i v^2}\int d^2x\; \epsilon_{\mu\nu}\,\partial_\mu\left(\phi_a^\star\,\partial_\nu\phi_a\right) $$

One can show that this coincides with the magnetic flux (7.28) using the equation of motion for
A_μ from (7.19).


These instantons have a number of interesting properties. One can show that their
action is given by

$$ S_{\rm instanton} = 2\pi v^2 \qquad (7.29) $$


The scale invariance of the classical 2d sigma model means that the instantons cannot
have a fixed size. Instead, like their Yang-Mills counterparts discussed in Section 2.3,
they have a scaling modulus. There are also further moduli that describe how the
instanton is oriented inside CP^{N-1}. In all, the single instanton has 2N parameters,
which decompose into two position moduli, a scaling modulus, and 2N − 3 orientational
moduli.


We now come to the second question: what role do these instantons play in determining
the low energy physics? For N > 2, the answer is: surprisingly little. This
can be seen, for example, by comparing the mass scale (7.26) to the instanton action
(7.29),

$$ \Lambda_{CP^{N-1}} = \Lambda_{UV}\, e^{-S_{\rm instanton}/N} $$

This factor of N is important: it is telling us that the instantons are not responsible
for the mass gap in the CP^{N-1} model.
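
One way to make the mismatch explicit is to compare the semi-classical weight of a single instanton with the dynamical scale. The line below is only a rewriting of (7.26) and (7.29), taking e^{−S} as the usual leading-order weight of a one-instanton configuration:

```latex
\begin{equation}
e^{-S_{\rm instanton}} = e^{-2\pi v^2} = \left(\frac{\Lambda_{CP^{N-1}}}{\Lambda_{UV}}\right)^{N}
\end{equation}
```

At large N this is exponentially smaller than the ratio Λ_{CP^{N-1}}/Λ_UV that sets the mass gap, so a dilute gas of instantons cannot account for it.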

The issue here is that, as we have seen, the CP^{N-1} model is strongly coupled, and
it is not appropriate to try to employ semi-classical techniques like instantons. Indeed,
the existence of instantons hinges on the fact that we pick a vacuum state on CP^{N-1}
which, in turn, spontaneously breaks the SU(N) global symmetry. Yet, the large
N expansion tells us that this is a red herring: in the quantum theory the SU(N)
symmetry is restored. The true ground state does not involve a preferred point on
CP^{N-1}, but rather a wavefunction that spreads over the whole space. As such, the
role of instantons in this theory is limited when it comes to determining the infra-red
physics. The same lesson is expected to hold in Yang-Mills.


The Theta Angle 


So far we have not discussed the role of the theta angle in the CP^{N-1} model. There is
something interesting here. For N ≥ 3 (e.g. for CP² or higher) it is thought that, while
the theta angle affects the spectrum of the theory, it does not change the phase and the
theory remains gapped for all θ. However, for CP¹, something special happens. Here,
the theory is thought to be gapped for all θ ≠ π. At θ = π, the theory is expected to
be gapless, with the low-energy physics described by an SU(2)_1 Wess-Zumino-Witten
model. This is sometimes referred to as the Haldane conjecture.


7.4 Fermions in Two Dimensions 


It’s now time to look at fermions. In this section, we will describe a theory that consists 
only of interacting fermions. In d = 3+1 dimensions, such theories are not particularly 
interesting because the simplest interaction — a four fermion term — is irrelevant. This 
is no longer the case in d = 1 + 1 dimensions and, as we will see, even the simplest 
theories of interacting fermions are strongly interacting and, like the CP^{N-1} model
above, share a number of surprising properties with QCD. 


We start by reviewing some basic facts about fermions in d = 1+1 dimensions. The
Clifford algebra {γ^μ, γ^ν} = 2η^{μν} is satisfied by 2 × 2 matrices. Working in signature
η^{μν} = diag(+1, −1), we take the gamma matrices to be

$$ \gamma^0 = \sigma^1 \quad{\rm and}\quad \gamma^1 = i\sigma^2 \quad\Rightarrow\quad \gamma^3 = \gamma^0\gamma^1 = -\sigma^3 \qquad (7.30) $$

Here γ³ plays the same role as γ⁵ in d = 3+1 dimensions. It is an extra, anti-commuting
matrix which can be used to decompose the two-component Dirac fermion as

$$ \psi = \begin{pmatrix}\psi_+ \\ \psi_-\end{pmatrix} $$

Here ψ_± are 2d Weyl spinors; they are eigenstates of γ³.

Fermions in d = 1+1 dimensions have the special property that they can be both
Weyl and Majorana at the same time. This follows because the chiral basis of gamma
matrices (7.30) is also real. (In contrast, in d = 3+1 dimensions you can pick a
real basis of gamma matrices but it is not chiral, or a chiral basis which is not real.)
This means that we can decompose the Dirac fermion as ψ = χ₁ + iχ₂ and, moreover,
decompose each Weyl fermion as ψ_± = χ_{1±} + iχ_{2±}. In what follows, we won’t need this
Majorana decomposition until Section 7.4.2.
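
The claims about this basis are simple enough to verify by direct matrix multiplication. The sketch below assumes the choice γ⁰ = σ¹, γ¹ = iσ² with signature diag(+1, −1); the helper names are introduced here for illustration only. It checks the Clifford algebra, that γ³ = γ⁰γ¹ is diagonal (so the basis is chiral), and that all three matrices are real (so a Majorana condition can be imposed at the same time).

```python
# 2x2 matrices as nested lists; entries may be complex
SIGMA1 = [[0, 1], [1, 0]]
SIGMA2 = [[0, -1j], [1j, 0]]
SIGMA3 = [[1, 0], [0, -1]]
ETA = [[1, 0], [0, -1]]  # signature diag(+1, -1)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def is_real(a, tol=1e-12):
    return all(abs(complex(x).imag) < tol for row in a for x in row)

def is_close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

GAMMA0 = SIGMA1
GAMMA1 = scale(1j, SIGMA2)        # i*sigma^2 = [[0, 1], [-1, 0]]
GAMMA3 = matmul(GAMMA0, GAMMA1)   # expected to equal -sigma^3

if __name__ == "__main__":
    print("gamma^3 =", GAMMA3)
    print("all real:", all(is_real(g) for g in (GAMMA0, GAMMA1, GAMMA3)))
```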


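The action for a free Dirac fermion is

$$ \begin{aligned}
S &= \int d^2x\; i\bar\psi\,\partial\!\!\!/\,\psi - m\bar\psi\psi \qquad (7.31) \\
&= \int d^2x\; i\psi_+^\dagger\partial_-\psi_+ + i\psi_-^\dagger\partial_+\psi_- - m\left(\psi_+^\dagger\psi_- + \psi_-^\dagger\psi_+\right)
\end{aligned} $$

where we have introduced lightcone coordinates x^± = t ± x and ∂_± = ∂_t ± ∂_x.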


For a massless fermion, with m = 0, the two Weyl spinors decouple, with equations
of motion

$$ \partial_+\psi_- = 0 \;\Rightarrow\; \psi_- = \psi_-(x^-) \quad{\rm and}\quad \partial_-\psi_+ = 0 \;\Rightarrow\; \psi_+ = \psi_+(x^+) $$

We learn that the chiral fermion ψ_− is a function only of x^−. In other words, ψ_− is a
right-moving fermion. Similarly, ψ_+ is a left-moving fermion. Since the fermions are
massless, each moves at the speed of light.


In d = 3+1 dimensions, interactions between fermions are always mediated by gauge
or scalar fields. In d = 1+1 dimensions we have a more direct possibility. The fermion
field has dimension [ψ] = 1/2 which means that the four fermion term (ψ̄ψ)² is marginal.
We can ask: how does this change the low-energy physics? In fact, as we will discuss, there
are two different ways of adding four fermion terms.


7.4.1 The Gross-Neveu Model 


The Gross-Neveu model describes N classically massless Dirac fermions ψ_i, i =
1,..., N, with a four fermi interaction. The action is given by

$$ S = \int d^2x\; i\bar\psi_i\,\partial\!\!\!/\,\psi_i + \frac{\lambda}{2N}\left(\bar\psi_i\psi_i\right)^2 \qquad (7.32) $$

Here λ is a dimensionless coupling. We have included the factor of N in anticipation
of the fact that we will solve this theory in the large N limit.

The action has a manifest U(1)_V × SU(N) flavour symmetry, under which the
fermions transform as N_{+1}. In fact, if we decompose each Dirac fermion into two
Majorana fermions, the symmetry group is actually O(2N), and this will
play a role shortly. There is also a discrete Z_2 chiral symmetry,

$$ \mathbf{Z}_2: \;\; \psi_i \rightarrow \gamma^3\psi_i \qquad (7.33) $$

Importantly, a would-be mass term is odd under this discrete chiral symmetry, ψ̄_iψ_i →
−ψ̄_iψ_i. This means that the existence of the Z_2 symmetry would naively prohibit the
generation of a mass. Our goal is to see how this plays out in the quantum theory.


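It turns out that life is easier if we introduce an auxiliary scalar field, σ, and write
the action as

$$ S = \int d^2x\; i\bar\psi_i\,\partial\!\!\!/\,\psi_i - \frac{N}{2\lambda}\sigma^2 + \sigma\,\bar\psi_i\psi_i \qquad (7.34) $$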


Although σ is dynamical, we do not include a kinetic term for it. We can integrate it
out by imposing the equation of motion

$$ \sigma = \frac{\lambda}{N}\bar\psi_i\psi_i $$

and we get back the original action (7.32). The new form of the action (7.34) is again
invariant under the discrete chiral symmetry, but only if we take σ to be odd,

$$ \mathbf{Z}_2: \;\; \sigma \rightarrow -\sigma $$
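
The elimination of σ can be checked in one line. The sketch below assumes that the σ-dependent part of the auxiliary-field action is −(N/2λ)σ² + σ ψ̄_iψ_i, which is the form implied by the equation of motion quoted above:

```latex
\begin{align}
\left[-\frac{N}{2\lambda}\sigma^2 + \sigma\,\bar\psi_i\psi_i\right]_{\sigma = \frac{\lambda}{N}\bar\psi_j\psi_j}
 &= -\frac{N}{2\lambda}\,\frac{\lambda^2}{N^2}\left(\bar\psi_i\psi_i\right)^2 + \frac{\lambda}{N}\left(\bar\psi_i\psi_i\right)^2 \\
 &= \frac{\lambda}{2N}\left(\bar\psi_i\psi_i\right)^2
\end{align}
```

which reproduces the four fermi interaction with coefficient λ/2N.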


The introduction of σ is reminiscent of the auxiliary field that we introduced in the
CP^{N-1} model. Indeed, we will proceed by following the same strategy. We will integrate
out the fields that we thought we cared about (in this case the fermions) and
focus on the resulting effective dynamics for σ. We will see that this is sufficient to
teach us the relevant physics.


Integrating out the fermions leaves behind the following effective action for σ,

$$ S_{\rm eff} = -iN\log\det\left(i\partial\!\!\!/ + \sigma\right) - \int d^2x\; \frac{N}{2\lambda}\sigma^2 $$


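We can write the first term in more concrete form. First,

$$ \det\left(i\partial\!\!\!/ + \sigma\right) = \det\left(i\partial\!\!\!/\right)\,\det\left(1 - i\partial\!\!\!/^{-1}\sigma\right) $$

and we neglect the factor det(i∂̸) on the grounds that it contributes an irrelevant
constant. The next step is to deal with the gamma matrix structure in the second
term. Using det γ³ = −1, we have

$$ \det\left(1 - i\partial\!\!\!/^{-1}\sigma\right) = \det\left(\gamma^3\left(1 - i\partial\!\!\!/^{-1}\sigma\right)\gamma^3\right) = \det\left(1 + i\partial\!\!\!/^{-1}\sigma\right) $$

Multiplying these together then gives

$$ \det\left(1 - i\partial\!\!\!/^{-1}\sigma\right) = {\det}^{1/2}\left(1 + (\partial\!\!\!/^{-1}\sigma)^2\right) = {\det}^{1/2}\left(1 + \sigma\frac{1}{\partial^2}\sigma\right) $$

where the argument in the final expression comes with a 2 × 2 unit matrix for the spinor
indices. But this simply changes det^{1/2} back to det. Finally, we use log det = Tr log to
write

$$ S_{\rm eff} = -iN\,{\rm Tr}\log\left(1 + \sigma\frac{1}{\partial^2}\sigma\right) - \int d^2x\; \frac{N}{2\lambda}\sigma^2 $$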
This action doesn’t look particularly appealing. But it has one important feature going
for it, which is that it’s proportional to N. This means that in the large N limit it can
be evaluated using a saddle point. We look for solutions in which σ is constant. In this
case, the annoying log factor can be replaced by a simple integral, leaving us with the
effective potential for the scalar field. Rotating to Euclidean space, we have

$$ V_{\rm eff}(\sigma) = -N\int^{\Lambda_{UV}} \frac{d^2k}{(2\pi)^2}\,\log\left(1 + \frac{\sigma^2}{k^2}\right) + \frac{N}{2\lambda}\sigma^2 $$


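This is the same kind of integral that we met in (7.24) when solving the 2d CP^{N-1}
model. The same method that we used previously now gives

$$ V_{\rm eff}(\sigma) = \frac{N}{4\pi}\sigma^2\left(\log\left(\frac{\sigma^2}{\Lambda_{UV}^2}\right) - 1\right) + \frac{N}{2\lambda}\sigma^2 \qquad (7.35) $$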
An Ne gs A 


In the large N limit, the path integral is dominated by the minimum of the potential
which sits at

$$ \frac{\partial V_{\rm eff}}{\partial\sigma} = 0 \;\;\Rightarrow\;\; \sigma^2 = \Lambda_{UV}^2\, e^{-2\pi/\lambda} $$

We learn that the σ field gets an expectation value. The theory was originally invariant
under the discrete chiral symmetry, σ → −σ, but this is spontaneously broken in the
ground state: the theory must choose one of the two ground states σ = ±Λ_UV e^{−π/λ}.




With the protective Z_2 symmetry spontaneously broken, there is nothing to stop the
fermions getting a mass. Indeed, substituting the expectation value of σ back into the
action (7.34), we find that the mass is given by

$$ m_{GN} = \Lambda_{UV}\, e^{-\pi/\lambda} \qquad (7.36) $$

Once again we have the phenomenon of dimensional transmutation: the dimensionless
coupling λ has combined with the UV cut-off to provide a physical mass scale of the
theory. Once again, we thought that we started out with a theory of massless particles,
but the interactions find an ingenious way to generate a mass.
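
The location of the minimum can also be found numerically. The following is a minimal sketch, assuming the one-loop potential takes the form V/N = (σ²/4π)(log(σ²/Λ_UV²) − 1) + σ²/2λ; the golden-section search and the sample value λ = 1.5 are choices made here for illustration. It compares the numerical minimum with Λ_UV e^{−π/λ}, working in units where Λ_UV = 1.

```python
import math

def v_eff_per_flavour(sigma, lam, cutoff=1.0):
    """One-loop effective potential divided by N, for constant sigma > 0."""
    s2 = sigma * sigma
    return s2 * (math.log(s2 / cutoff ** 2) - 1) / (4 * math.pi) + s2 / (2 * lam)

def minimise(f, lo, hi, iters=200):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 0.5 * (a + b)

lam = 1.5
sigma_min = minimise(lambda s: v_eff_per_flavour(s, lam), 1e-6, 1.0)
m_gn = math.exp(-math.pi / lam)   # predicted mass in units of the cut-off

if __name__ == "__main__":
    print("numerical minimum:", sigma_min)
    print("exp(-pi/lambda)  :", m_gn)
```

The potential at the minimum is negative, below its value at σ = 0, which is why the symmetric point is not the ground state.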


Above we have phrased the physics in terms of the effective potential. Another
approach would be to compute one-loop contributions to the running of the coupling.
We would have found that the theory is asymptotically free, with the beta function

$$ \beta(\lambda) \equiv \mu\frac{d\lambda}{d\mu} = -\frac{\lambda^2}{\pi} $$

Phrased in this way, the physical mass is seen to be RG invariant, as it should be:

$$ m_{GN} = \mu\, e^{-\pi/\lambda(\mu)} \quad\Rightarrow\quad \frac{dm_{GN}}{d\mu} = 0 $$
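
The link between the running coupling and the RG invariance of the mass can be illustrated numerically. The sketch below assumes that the mass takes the form m = μ e^{−π/λ(μ)}, in which case μ-independence requires μ dλ/dμ = −λ²/π; the function names and the reference values are hypothetical and chosen here for illustration.

```python
import math

def running_coupling(mu, lam0, mu0):
    """Solution of mu dlam/dmu = -lam^2/pi with lam(mu0) = lam0.
    Valid for mu above the scale where 1/lam would vanish."""
    return 1.0 / (1.0 / lam0 + math.log(mu / mu0) / math.pi)

def physical_mass(mu, lam0, mu0):
    """m = mu * exp(-pi/lam(mu)), which should not depend on mu."""
    return mu * math.exp(-math.pi / running_coupling(mu, lam0, mu0))

def beta_numeric(mu, lam0, mu0, rel_step=1e-6):
    """Central finite-difference estimate of mu dlam/dmu."""
    up = running_coupling(mu * (1 + rel_step), lam0, mu0)
    down = running_coupling(mu * (1 - rel_step), lam0, mu0)
    return (up - down) / (2 * rel_step)

if __name__ == "__main__":
    for mu in (1.0, 10.0, 100.0):
        print(mu, running_coupling(mu, 1.0, 1.0), physical_mass(mu, 1.0, 1.0))
```

The coupling decreases as μ increases, which is asymptotic freedom, while the mass printed in the last column stays fixed.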


7.4.2 Kinks in the Gross-Neveu Model 


As we’ve seen, the Gross-Neveu model spontaneously breaks the Z_2 symmetry. This
means that the theory has two degenerate ground states, distinguished by the sign of
σ = ±Λ_UV e^{−π/λ}. This gives us a new state in the theory: a kink which interpolates
between the two ground states, so that the profile of σ(x) obeys

$$ \sigma \rightarrow \pm\Lambda_{UV}\, e^{-\pi/\lambda} \quad{\rm as}\quad x \rightarrow \pm\infty $$


We would like to understand what properties these kinks have and, in particular, how 
they transform under the symmetries of the theory. The key to this is to see what 
happens to the original fermions in the presence of the kink. 


The Dirac equation from (7.34) is

$$ i\partial\!\!\!/\,\psi_i + \sigma\psi_i = 0 $$

We’d like to solve this in the kink background. You might think that this is tricky
because we haven’t determined the profile σ(x) of the kink. Fortunately, this isn’t a
problem, because the property that we need is robust and independent of the exact
form of σ(x): this is the existence of a fermi zero mode.

We met fermi zero modes on domain walls previously, both in our discussion of
topological insulators in Section 3.3.4 and lattice gauge theory in Section 4.4.1. The
analysis needed here is exactly the same, and we won’t repeat it. But the upshot is
that each fermion ψ_i has a single, complex fermi zero mode on the kink.


At this point, it is important to recall that our Dirac fermions can be decomposed
into Majorana fermions, which we write as

$$ \psi_i = \chi_i + i\chi_{i+N} \qquad i = 1,\ldots,N $$

The existence of Majorana fermions means that the global symmetry of the Gross-
Neveu model is O(2N) rather than U(N). Each of these Majorana fermions gives rise
to a single, Majorana (i.e. real) fermi zero mode on the kink which we will denote as
b_i. These obey the anti-commutation relations,

$$ \{b_i, b_j\} = 2\delta_{ij} \qquad i,j = 1,\ldots,2N \qquad (7.37) $$

To convince yourself that these are the right anti-commutation relations, note that we can pair
the Majorana modes back into their complex counterparts c_i = (b_i + ib_{i+N})/√2, with
i = 1,...,N which, from (7.37), obey the fermionic creation and annihilation
anti-commutation relations {c_i, c_j} = 0 and {c_i, c_j†} = 2δ_ij, up to a choice of normalisation.


The anti-commutation relations (7.37) are familiar: they are simply the Clifford algebra
in D = 2N dimensions. This has a representation in terms of 2^N × 2^N dimensional
matrices. Said in a different way, the Majorana zero modes ensure that the Hilbert
space of kink excitations has dimension 2^N.


This 2^N dimensional Hilbert space does not form an irreducible representation of the
O(2N) symmetry group. Instead, it decomposes into two chiral spinors. We achieve this
by introducing the “γ⁵” matrix, γ⁵ = i^N b_1 ... b_{2N}, which obeys {γ⁵, b_i} = 0 and (γ⁵)² = 1.
The two irreducible representations are distinguished by the eigenvalue γ⁵ = ±1,
and each has dimension 2^{N-1}.
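
The counting of kink states can be checked explicitly for small N. The following is a minimal sketch, assuming the zero modes are normalised so that {b_i, b_j} = 2δ_ij; the Jordan-Wigner style construction from Pauli matrices and all function names are choices made here for illustration, and any other representation of the algebra would do equally well. It builds 2N matrices of size 2^N, verifies the Clifford algebra, and uses the trace of the chirality matrix i^N b_1 ... b_{2N} to count the states of each chirality.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def zero_modes(n_dirac):
    """2N matrices of size 2^N with {b_i, b_j} = 2 delta_ij."""
    modes = []
    for k in range(n_dirac):
        for pauli in (X, Y):
            factors = [Z] * k + [pauli] + [I2] * (n_dirac - k - 1)
            m = factors[0]
            for f in factors[1:]:
                m = kron(m, f)
            modes.append(m)
    return modes

def chirality(modes):
    """i^N b_1 ... b_2N, which anticommutes with every mode and squares to one."""
    n_dirac = len(modes) // 2
    g = modes[0]
    for b in modes[1:]:
        g = matmul(g, b)
    phase = 1j ** n_dirac
    return [[phase * x for x in row] for row in g]

def chiral_dimensions(modes):
    """Dimensions of the chirality +1 and -1 subspaces, from the trace."""
    g5 = chirality(modes)
    dim = len(g5)
    trace = sum(g5[k][k] for k in range(dim))
    plus = round((dim + complex(trace).real) / 2)
    return plus, dim - plus

if __name__ == "__main__":
    print(chiral_dimensions(zero_modes(3)))
```

For N = 3 the 8 states split evenly between the two chiralities, matching 2^{N-1} = 4 each.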


The upshot of this analysis is rather nice. We started with Majorana fermions transforming
in the 2N-dimensional vector representation of O(2N). But the interactions
generate new solitonic states. These are kinks which transform in the left and right-
handed spinor representations of O(2N). This can be thought of as a version of “charge
fractionalisation”.




Our results in this section used the large N approximation to determine the fate 
of the Gross-Neveu model. One might wonder if the kinks survive to small N. It 
turns out that for N > 2, both kinks and fermions exist in the spectrum. But, perhaps 
counterintuitively, when N = 2 only kinks, in the spinor representation of O(4), survive; 
the original fermions no longer exist. For N = 1, the Gross-Neveu model coincides with 
the Thirring model and turns out to be free. We will discuss this case in Section 7.5. 


An Odd Number of Majorana Fermions 


So far, our discussion of the Gross-Neveu model has focussed on N Dirac fermions or,
equivalently, 2N Majorana fermions. But there’s nothing to stop us writing down the
action for an odd number of Majorana fermions χ_i,

$$ S = \int d^2x\; i\bar\chi_i\,\partial\!\!\!/\,\chi_i - \frac{\tilde N}{4\lambda}\sigma^2 + \sigma\,\bar\chi_i\chi_i $$

where the summation is over i = 1,..., Ñ. When Ñ = 2N, this reduces to our previous
action (7.34) in terms of Dirac fermions. When Ñ is odd, our previous analysis goes
through unchanged, and we again find that the Z_2 is spontaneously broken, resulting in
two degenerate ground states. The only novel question is: what becomes of the kinks?


The Majorana zero modes again give rise to a Clifford algebra (7.37), but this time
it’s a Clifford algebra in D = Ñ dimensions, with Ñ odd. There is a single irreducible
representation which has dimension 2^{(Ñ-1)/2} and one might think this is the Hilbert
space of the kinks. However, there is another discrete symmetry that we have to take
into account. This is χ_i → −χ_i which is part of the O(Ñ) group, but not SO(Ñ). To
implement this, we introduce the fermion parity operator (−1)^F which obeys

$$ (-1)^F\, \chi_i\, (-1)^F = -\chi_i \quad\Rightarrow\quad \{(-1)^F, b_i\} = 0 $$

When Ñ = 2N is even, the operator γ⁵ can be identified with (−1)^F. But when Ñ is
odd, there is no action of (−1)^F on a single irreducible representation of the Clifford
algebra. Instead, we need two irreducible representations: one with (−1)^F = +1 and
one with (−1)^F = −1. This means that for Ñ odd, we again have two irreducible
representations of O(Ñ), and the total number of kink states is 2 × 2^{(Ñ-1)/2}.


7.4.3 The Chiral Gross-Neveu Model 


There is a variant on the Gross-Neveu model that introduces yet another ingredient
into the mix. First, consider the action of the axial symmetry

$$ U(1)_A: \;\; \psi \rightarrow e^{i\alpha\gamma^3}\psi $$

There are two real fermion bilinears that we can introduce: ψ̄ψ and iψ̄γ³ψ. Neither of
them is invariant under the axial symmetry. Instead, each rotates into the other. We
can form the complex combinations ψ̄ψ ± ψ̄γ³ψ, and these transform as

$$ \begin{aligned}
U(1)_A: \;\; \bar\psi\psi + \bar\psi\gamma^3\psi &\rightarrow e^{2i\alpha}\left(\bar\psi\psi + \bar\psi\gamma^3\psi\right) \\
\bar\psi\psi - \bar\psi\gamma^3\psi &\rightarrow e^{-2i\alpha}\left(\bar\psi\psi - \bar\psi\gamma^3\psi\right)
\end{aligned} $$

This transformation motivates us to consider the following theory of N massless, interacting
Dirac fermions,

$$ S = \int d^2x\; i\bar\psi_i\,\partial\!\!\!/\,\psi_i + \frac{\lambda}{2N}\left[\left(\bar\psi_i\psi_i\right)^2 - \left(\bar\psi_i\gamma^3\psi_i\right)^2\right] \qquad (7.38) $$

The advantage of this set-up is that the theory is protected from generating a mass
term by the continuous U(1)_A chiral symmetry, as opposed to the discrete Z_2 chiral
symmetry of the original Gross-Neveu model.
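
The way the two bilinears rotate into each other, which is what makes the interaction in (7.38) invariant, can be spelled out. The sketch below assumes that U(1)_A acts as ψ → e^{iαγ³}ψ, and uses only that γ³ is Hermitian, squares to one, and anti-commutes with γ⁰:

```latex
\begin{align}
\bar\psi = \psi^\dagger\gamma^0 \;&\rightarrow\; \psi^\dagger e^{-i\alpha\gamma^3}\gamma^0 = \bar\psi\, e^{i\alpha\gamma^3} \\
\bar\psi\psi \;&\rightarrow\; \bar\psi\, e^{2i\alpha\gamma^3}\psi = \cos 2\alpha\;\bar\psi\psi + i\sin 2\alpha\;\bar\psi\gamma^3\psi \\
\bar\psi\gamma^3\psi \;&\rightarrow\; \bar\psi\, e^{2i\alpha\gamma^3}\gamma^3\psi = \cos 2\alpha\;\bar\psi\gamma^3\psi + i\sin 2\alpha\;\bar\psi\psi \\
\bar\psi\psi \pm \bar\psi\gamma^3\psi \;&\rightarrow\; e^{\pm 2i\alpha}\left(\bar\psi\psi \pm \bar\psi\gamma^3\psi\right)
\end{align}
```

The product of the two complex combinations, (ψ̄ψ)² − (ψ̄γ³ψ)², is therefore invariant, while each of ψ̄ψ and ψ̄γ³ψ on its own is not.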


This is an important distinction. We saw above that the discrete Z_2 symmetry
proved ineffectual at protecting the Gross-Neveu model from developing a gap because
it was spontaneously broken. However, there is a general theorem, due to Mermin and
Wagner, that says it is not possible to spontaneously break continuous symmetries in
d = 1+1 quantum field theory. We met this theorem in the lectures on Statistical
Field Theory; its essence is that infra-red fluctuations of fields always destroy any long
range order.


Given this theorem, you might think that the existence of a continuous chiral symmetry
would be much more powerful and protect the fermions from developing a gap.
You would be wrong. As we now show, the Mermin-Wagner theorem notwithstanding,
the chiral Gross-Neveu model (7.38) also generates a gap at low energies.


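To see this, we use the same trick as before but this time introduce two auxiliary
fields, σ and π. The action (7.38) can be written as

$$ S = \int d^2x\; i\bar\psi_i\,\partial\!\!\!/\,\psi_i - \frac{N}{2\lambda}\left(\sigma^2 + \pi^2\right) + \bar\psi_i\left(\sigma + i\pi\gamma^3\right)\psi_i \qquad (7.39) $$

The equations of motion for σ and π then tell us that

$$ \sigma \pm i\pi = \frac{\lambda}{N}\,\bar\psi_i\left(1 \mp \gamma^3\right)\psi_i \qquad (7.40) $$

The action (7.39) remains invariant under U(1)_A provided that the auxiliary scalars
transform as

$$ U(1)_A: \;\; \sigma + i\pi \rightarrow e^{-2i\alpha}\left(\sigma + i\pi\right) \qquad (7.41) $$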
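
Evaluating the fermion determinant in the same way as before, we find

$$ \det\left(1 - i\partial\!\!\!/^{-1}\left(\sigma + i\pi\gamma^3\right)\right) = {\det}^{1/2}\left(1 + (\partial\!\!\!/^{-1}\sigma)^2 + (\partial\!\!\!/^{-1}\pi)^2\right) $$

Viewing both σ and π as constants, we’re then left with the effective potential,

$$ V_{\rm eff}(\sigma,\pi) = \frac{N}{4\pi}\left(\sigma^2+\pi^2\right)\left(\log\left(\frac{\sigma^2+\pi^2}{\Lambda_{UV}^2}\right) - 1\right) + \frac{N}{2\lambda}\left(\sigma^2+\pi^2\right) $$

This is identical to the potential (7.35) for the original Gross-Neveu model, but with
σ² replaced with σ² + π². Note, in particular, that the potential is invariant under the
U(1)_A action (7.41) as it should be.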


What do we do with this potential? Because we’re in d = 1+1 dimensions, we
should be a little careful. We parameterise the complex scalar field as

$$ \sigma + i\pi = \rho\, e^{i\theta} $$

The minimum of the potential sits at

$$ \rho = m_{GN} $$

where m_GN is the same dynamically generated mass scale (7.36) that we saw in the
previous model. This is already sufficient to tell us that the fermions generate a mass.


The care is needed when we come to the angular field mode θ(x). This shifts by a
constant, θ → θ − 2α, under the U(1)_A symmetry (7.41). If we were in a higher dimension, we would
argue that θ(x) should take some fixed value in the ground state, breaking the U(1)_A
symmetry. In such a situation, we would identify the Goldstone boson as θ, which
necessarily remains gapless.


However, in d = 1+1 dimensions the story is a little different. As we mentioned above,
the Mermin-Wagner theorem tells us that there are no Goldstone modes. Instead, the
ground state wavefunction is closer in spirit to quantum mechanics, spreading over
all values of θ. This is a topic that we discussed in some detail in the lectures on
Statistical Field Theory in the context of the Kosterlitz-Thouless phase transition. We
will recount the important facts here. The key result is that while θ does indeed remain
massless, it is not a Goldstone boson. This is not merely a matter of terminology: the
physics differs.

First, we need to work a little harder in expanding the effective action. The effective
action is

$$ S_{\rm eff} = -iN\,{\rm Tr}\log\left(i\partial\!\!\!/ + \rho\, e^{i\theta\gamma^3}\right) - \int d^2x\; \frac{N}{2\lambda}\rho^2 $$


It’s no longer sufficient to focus on constant values of σ and π since the resulting
potential will not depend on θ. Instead, we need to consider slowly varying θ. The
leading term in the effective action is the obvious one:

$$ S_{\rm eff} = \int d^2x\; \frac{N}{4\pi}\left(\partial_\mu\theta\right)^2 $$


This theory is less trivial than it looks! Because θ is a periodic variable θ ∈ [0, 2π), a
so-called compact boson, the overall normalisation factor N/4π is meaningful and will
show up in correlation functions. We will need to study such theories in some detail in
Section 7.5, but for now a quick and simple computation of the 2-point correlators will
suffice. If θ were a normal scalar field in d = 1+1 dimensions, we would have

$$ \langle\theta(x)\,\theta(0)\rangle = -\frac{1}{N}\log\left(\Lambda_{UV}|x|\right) \qquad (7.42) $$


However, because it’s a compact boson we should really work with the single-valued
operator e^{iθ}. The appropriate correlation function then follows from Wick’s theorem,
together with the result (7.42),

$$ \langle e^{i\theta(x)}\, e^{-i\theta(0)}\rangle = e^{\langle\theta(x)\theta(0)\rangle} \sim \frac{1}{|x|^{1/N}} $$

We see that in the strict N → ∞ limit, the theory exhibits the long range order
expected from spontaneous symmetry breaking. Indeed, there is a loophole in the
Mermin-Wagner theorem and it breaks down in theories with an infinite number of
fields. However, for any large, but finite N, we find “quasi-long range order”, with
correlation functions dropping off very slowly.
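
To get a feel for how slowly these correlations decay, it helps to put in numbers. The sketch below simply evaluates the power law (Λ_UV|x|)^{−1/N}, assuming the normalisation ⟨θ(x)θ(0)⟩ = −(1/N) log(Λ_UV|x|); the function name and the sample values are chosen here for illustration.

```python
def phase_correlator(distance, n_flavours, cutoff=1.0):
    """Power-law estimate of <exp(i theta(x)) exp(-i theta(0))>,
    meaningful for cutoff * distance >> 1."""
    return (cutoff * distance) ** (-1.0 / n_flavours)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        row = [phase_correlator(d, n) for d in (1e2, 1e4, 1e6)]
        print(n, row)
```

For N = 10 the correlator has only dropped to about a quarter at a separation of a million cut-off lengths, and for N = 1000 it is still within a couple of percent of one. True long range order, a constant at large separation, is only reached at N = ∞.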


This translates directly into correlation functions between fermion bilinears. Using
(7.40), we again see the phenomenon of quasi-long range order,

$$ \langle\, \bar\psi_i\left(1-\gamma^3\right)\psi_i(x)\;\; \bar\psi_j\left(1+\gamma^3\right)\psi_j(0)\,\rangle \sim \frac{1}{|x|^{1/N}} $$

The upshot is that, once again, an interacting quantum field theory (7.38) has found
a way to generate a mass. This time, the fermions get mass but the chiral U(1)_A
symmetry remains unbroken.


7.4.4 Back to Basics: Quantising Fermions in 2d


Given that we’ve just used path integral techniques to solve a theory of strongly interacting
fermions, what we’re about to do next may seem a little odd. We will return to
the free fermion and solve it using canonical quantisation.


This is the kind of calculation that we did in our first course in Quantum Field 
Theory, and you may reasonably wonder why we’re bothering to do it again now that 
we’re grown up. The reason is that it will prove an important warm-up for the following 
section where we discuss bosonization. 


We introduced the action for a massless fermion in (7.31). A two-component Dirac
fermion can be decomposed into Weyl fermions ψ^T = (ψ_+, ψ_−), in terms of which the
action is

$$ S = \int d^2x\; i\psi_+^\dagger\partial_-\psi_+ + i\psi_-^\dagger\partial_+\psi_- $$

The two Weyl fermions ψ_± are independent. This means that there are two conserved
quantities: these are the vector and axial currents and will be particularly important
in what follows. The vector current is

$$ j_V^\mu = \bar\psi\gamma^\mu\psi \qquad (7.43) $$

while a massless fermion also has a conserved axial current given by

$$ j_A^\mu = \bar\psi\gamma^\mu\gamma^3\psi \qquad (7.44) $$

From these we can construct two conserved charges, Q_V and Q_A.


The Weyl fermion ψ_− is right-moving, and quantisation of this field will lead to
particles with momentum p > 0. Similarly, the quantisation of ψ_+ will lead to particles
with momentum p < 0. The mode expansion of the operators in the Schrödinger picture
follows the familiar story described in the lectures on Quantum Field Theory

$$ \psi_-(x) = \int_0^\infty \frac{dp}{2\pi}\left(b_{-p}\,e^{ipx} + c_{-p}^\dagger\,e^{-ipx}\right) \qquad (7.45) $$

$$ \psi_+(x) = \int_{-\infty}^0 \frac{dp}{2\pi}\left(b_{+p}\,e^{ipx} + c_{+p}^\dagger\,e^{-ipx}\right) \qquad (7.46) $$

with the creation and annihilation operators obeying the standard anti-commutation
relations {b_{±p}, b†_{±q}} = {c_{±p}, c†_{±q}} = 2π δ(p − q). The vacuum is defined by b_{±p}|0⟩ =
c_{±p}|0⟩ = 0, and the operators b†_{±p} and c†_{±p} then create particles and anti-particles
respectively.

It will turn out that we will need to be careful about various UV issues. For this
reason, we work instead with the mode expansion

$$ \psi_-(x) = \int_0^\infty \frac{dp}{2\pi}\left(b_{-p}\,e^{ipx} + c_{-p}^\dagger\,e^{-ipx}\right)e^{-|p|/2\Lambda} $$

$$ \psi_+(x) = \int_{-\infty}^0 \frac{dp}{2\pi}\left(b_{+p}\,e^{ipx} + c_{+p}^\dagger\,e^{-ipx}\right)e^{-|p|/2\Lambda} $$

where Λ is a UV cut-off scale. In what follows, all integrals will be over the full range
of R unless otherwise stated. We also introduce the UV length scale

$$ \epsilon = \frac{1}{\Lambda} $$

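We can then compute the two-point functions in position space. For example, we have

$$ \begin{aligned}
\langle\psi_-(x)\,\psi_-^\dagger(y)\rangle &= \int_0^\infty \frac{dp\,dq}{(2\pi)^2}\,\langle b_{-p}\, b_{-q}^\dagger\rangle\; e^{ipx - iqy}\, e^{-(p+q)/2\Lambda} \\
&= \int_0^\infty \frac{dp}{2\pi}\; e^{ip(x-y)}\, e^{-|p|/\Lambda} \\
&= \frac{i}{2\pi}\,\frac{1}{(x-y) + i\epsilon} \qquad (7.47)
\end{aligned} $$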


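You can also check that ⟨ψ_−†(x)ψ_−(y)⟩ = ⟨ψ_−(x)ψ_−†(y)⟩. In particular, if we combine
these results we have

$$ \begin{aligned}
\langle\{\psi_-(x),\, \psi_-^\dagger(y)\}\rangle &= \frac{i}{2\pi}\left(\frac{1}{(x-y)+i\epsilon} - \frac{1}{(x-y)-i\epsilon}\right) \\
&= \frac{1}{\pi}\,\frac{\epsilon}{(x-y)^2+\epsilon^2} \;\rightarrow\; \delta(x-y) \quad{\rm as}\;\; \epsilon\rightarrow 0
\end{aligned} $$

in agreement with the standard anti-commutation relations between fermions. Similarly,

$$ \langle\psi_+(x)\,\psi_+^\dagger(y)\rangle = -\frac{i}{2\pi}\,\frac{1}{(x-y) - i\epsilon} \qquad (7.48) $$

and ⟨ψ_+†(x)ψ_+(y)⟩ = ⟨ψ_+(x)ψ_+†(y)⟩.

The expressions (7.47) and (7.48) are the key bits of information that we need to
take forward into the next section where we discuss bosonization.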
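
The regulated two-point function is easy to test numerically. The following is a minimal sketch, assuming the right-moving propagator is the integral of e^{ip(x−y)} e^{−p/Λ} over p > 0 divided by 2π; the function names, the midpoint-rule quadrature and the sample values are choices made here for illustration. It compares the momentum integral with the closed form i/(2π) · 1/((x−y) + iε), and checks that the sum of the two orderings gives the smeared delta function ε/(π((x−y)² + ε²)).

```python
import cmath
import math

def two_point_numeric(separation, cutoff, steps=200_000, p_max_factor=40.0):
    """Midpoint-rule estimate of int_0^inf dp/(2 pi) exp(i p s) exp(-p/cutoff).
    The integral is truncated at p_max, beyond which the integrand is ~e^{-40}."""
    p_max = p_max_factor * cutoff
    h = p_max / steps
    total = 0j
    for j in range(steps):
        p = (j + 0.5) * h
        total += cmath.exp((1j * separation - 1.0 / cutoff) * p)
    return total * h / (2 * math.pi)

def two_point_closed(separation, cutoff):
    """Closed form (i / 2 pi) / (s + i eps) with eps = 1/cutoff."""
    eps = 1.0 / cutoff
    return 1j / (2 * math.pi) / (separation + 1j * eps)

def smeared_delta(separation, eps):
    """Lorentzian of width eps, which tends to delta(s) as eps -> 0."""
    return eps / (math.pi * (separation ** 2 + eps ** 2))

if __name__ == "__main__":
    s, cutoff = 0.7, 2.0
    print(two_point_numeric(s, cutoff), two_point_closed(s, cutoff))
    print(two_point_closed(s, cutoff) + two_point_closed(-s, cutoff),
          smeared_delta(s, 1.0 / cutoff))
```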


7.5 Bosonization in Two Dimensions


There is something rather wonderful about fermions in two dimensions: they can be 
rewritten in terms of bosons! The purpose of this section is to explain how on earth 
this is possible.

At first sight, this is a surprise. After all, the difference between bosons and fermions
is one of the most fundamental things we learn as undergraduates. However, there are 
reasons to suspect this difference is not so stark in d = 1+1 dimensions. First, the spin 
statistics theorem tells us that bosons have integer spin and fermions half-integer spin. 
Yet in one spatial dimension there is no meaning to rotation and, correspondingly, no 
meaning to spin. Relatedly, if we want to exchange two particles on a line, we can only 
do so by moving them past each other. This is in contrast to higher dimensions where 
particle positions can be exchanged, while keeping them separated by arbitrarily large 
distances. This simple observation suggests that interactions will be as important as 
statistics when particles are confined to live on a line. 


To begin, we will show that a free massless Dirac fermion in d = 1 + 1 is equivalent
to a free massless, real scalar field $\phi$. Even for free fields, this is a rather remarkable
claim. The Hilbert space of a single bosonic oscillator looks nothing like the Hilbert space
of a single fermionic oscillator, yet we claim that the theories in d = 1+1 not only have
the same Hilbert space (at least after we include a subtle $\mathbf{Z}_2$ issue), but also the same
spectrum. Furthermore, for any operator that we can construct out of fermions, there 
is a corresponding operator made from bosons. Here we will focus on these operators 
and show that the correlation functions of the fermionic theory coincide with those of 
the bosonic theory. 


The Compact Boson 


The bosonic theory that we will focus on is deceptively simple. It is the theory of a
massless, real scalar field $\phi$. We write its action as

$$S = \int d^2x\ \frac{\beta^2}{2}\,(\partial_\mu\phi)^2 \qquad (7.49)$$

However, there is one difference with a usual scalar field: we will take our scalar $\phi$ to
be periodic, taking values in the range

$$\phi \in [0, 2\pi) \qquad (7.50)$$

We refer to this as a compact boson. The dimensionless parameter $\beta$ is called the
radius of the boson. (String theorists would usually define $R^2 = 2\pi\beta^2 l_s^2$ and call $R$ the
radius. Here $l_s$ is the string length, which gives $R$ dimension $-1$. Furthermore, it's
not uncommon to work in conventions with $l_s^2 = 2$, in which case $R^2 = 4\pi\beta^2$.)


Usually, the overall coefficient of the kinetic term does not affect the physics, since
it can always be absorbed into a redefinition of the field. But, in the present context,
we can't absorb $\beta$ without changing the periodicity of $\phi$. This leads us to suspect that
the simple action (7.49) describes a different theory for each choice of $\beta^2$, a suspicion
that we will confirm below. We will see that there is one special choice of $\beta^2$ for which
the compact boson coincides with the free fermion. (Spoiler: it's $\beta^2 = 1/4\pi$.)

What are the implications of having a compact boson? The first thing to notice
is that we can't add terms like $\phi^2$ or $\phi^4$ to the action, since these don't respect the
periodicity. Instead we should add terms like $\cos\phi$ and $\sin\phi$. Equivalently, the field $\phi$ is
not really a well defined operator. We should instead focus on operators like $e^{i\phi}$ which,
again, respect the periodicity. These are sometimes referred to as vertex operators,
following their role in String Theory. Our task below will be to compute correlation
functions of the vertex operators $e^{i\phi}$.


Now let's turn to the conserved currents of the theory (7.49). The action is invariant
under the symmetry $\phi \to \phi + {\rm constant}$. The associated current is

$$j^\mu_{\rm shift} = \beta^2\,\partial^\mu\phi$$

Clearly the equation of motion, $\partial^2\phi = 0$, ensures that $j^\mu_{\rm shift}$ is conserved. The
corresponding Noether charge is $Q_{\rm shift}$, under which the operator $e^{i\phi}$ has charge +1.

However, in two dimensions a massless scalar also enjoys another conserved current,

$$j^\mu_{\rm wind} = \frac{1}{2\pi}\,\epsilon^{\mu\nu}\partial_\nu\phi$$

which is conserved by dint of the epsilon symbol; we don't need to invoke the equation
of motion. To see the associated conserved quantity, it is useful to put the theory on a
spatial circle of radius $R$. The charge associated to $j^\mu_{\rm wind}$ is then

$$Q_{\rm wind} = \int_0^{2\pi R} dx\ j^0_{\rm wind} = \frac{1}{2\pi}\int_0^{2\pi R} dx\ \partial_x\phi$$

The conserved charge $Q_{\rm wind}$ is the number of times that $\phi \in [0,2\pi)$ winds around its
range as we go around the spatial circle. It is a topological charge. The existence of two,
independent U(1) global symmetries is reminiscent of the vector and axial symmetries
of the massless fermion. We'll make this connection more precise shortly.


7.5.1 T-Duality 


There is an alternative description of the compact boson in terms of a dual scalar. To
realise this, we take the original action (7.49) and think of $\partial\phi$ as the variable, rather
than $\phi$. We can do this only if we also impose an appropriate Bianchi identity. We
might naively think that the Bianchi identity is $\partial_\mu(\epsilon^{\mu\nu}\partial_\nu\phi) = 0$, but in fact this is too
strong since it kills all winding. Instead, we want

$$\frac{1}{2\pi}\int d^2x\ \partial_\mu(\epsilon^{\mu\nu}\partial_\nu\phi) = \frac{1}{2\pi}\oint dx^\mu\,\partial_\mu\phi \in \mathbf{Z} \qquad (7.51)$$

To impose this, we introduce a second compact boson

$$\tilde\phi \in [0, 2\pi)$$

and consider the action

$$S = \int d^2x\ \frac{1}{2}\beta^2(\partial_\mu\phi)^2 + \frac{1}{2\pi}\,\epsilon^{\mu\nu}\,\partial_\mu\phi\,\partial_\nu\tilde\phi$$

Integrating out $\tilde\phi$ in the partition function imposes the condition (7.51) and takes us
back to the original action (7.49). Alternatively, we can integrate out $\partial\phi$. Completing
the square, we have

$$S = \int d^2x\ \frac{1}{2}\beta^2\left[\partial_\mu\phi + \frac{1}{2\pi\beta^2}\,\epsilon_{\mu\nu}\partial^\nu\tilde\phi\right]^2 + \frac{1}{8\pi^2\beta^2}\,(\partial\tilde\phi)^2 \qquad (7.52)$$

This then gives an equivalent theory in terms of the dual scalar,

$$S = \int d^2x\ \frac{\tilde\beta^2}{2}\,(\partial_\mu\tilde\phi)^2 \quad {\rm with}\quad \tilde\beta^2 = \frac{1}{4\pi^2\beta^2} \qquad (7.53)$$

The theory (7.53) is entirely equivalent to our original theory (7.49). This is referred
to as T-duality.
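The map $\beta^2 \to \tilde\beta^2 = 1/4\pi^2\beta^2$ in (7.53) is simple enough to explore directly. The sketch below is illustrative only: the function and constant names are invented here, and `string_radius` assumes the convention $R = \sqrt{2\pi}\,\beta\, l_s$ quoted in the text.

```python
import math

FREE_FERMION_BETA_SQ = 1.0 / (4.0 * math.pi)  # the special value quoted in the text
SELF_DUAL_BETA_SQ = 1.0 / (2.0 * math.pi)     # fixed point of the duality map

def dual_beta_sq(beta_sq):
    """T-dual coupling from (7.53): beta~^2 = 1 / (4 pi^2 beta^2)."""
    return 1.0 / (4.0 * math.pi ** 2 * beta_sq)

def string_radius(beta_sq, l_s=1.0):
    """String-theory radius R = sqrt(2 pi) * beta * l_s for a given beta^2."""
    return math.sqrt(2.0 * math.pi * beta_sq) * l_s
```

Applying `dual_beta_sq` twice returns the original value, and in terms of `string_radius` the map reads $R \to l_s^2/R$. Note that the free fermion point $\beta^2 = 1/4\pi$ is not self-dual: it is mapped to $\tilde\beta^2 = 1/\pi$.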


T-duality is particularly striking in the context of string theory. There, the compact
boson $\phi$ is interpreted as a compact direction of spacetime in which the string can
move. In the usual conventions of string theory, the radius of this circle is taken to
be $R = \sqrt{2\pi}\,\beta\, l_s$, with $l_s$ the string length. T-duality says that, as far as the string
is concerned, the physics is exactly the same if we instead take a spacetime with a
compact circle of radius $\tilde R = l_s^2/R$. In other words, very big circles are the same as
very small circles. You can read more about this interpretation in the lecture notes on
String Theory.

How is this possible? The key is the relation between $\phi$ and $\tilde\phi$, which can be found
inside the squared brackets in (7.52),

$$\partial_\mu\phi = -\frac{1}{2\pi\beta^2}\,\epsilon_{\mu\nu}\,\partial^\nu\tilde\phi \qquad (7.54)$$

This clearly relates the momentum current for $\phi$ to the winding current for $\tilde\phi$, and vice
versa. What looks like momentum modes in one description becomes winding modes
in the other. In particular, $e^{i\tilde\phi}$ carries charge +1 under $Q_{\rm wind}$.

Although the transformation between $\phi$ and $\tilde\phi$ is simple, it is also non-local. If we try
to solve for $\tilde\phi$ in terms of $\phi$, we must integrate. We'll see this clearly in (7.56) below.


Chiral Bosons 


In what follows, it will be useful to introduce chiral bosons, which are either purely left
moving or purely right moving. The equation of motion $\partial^2\phi = 0$ can be solved by

$$\phi = \phi_-(x^-) + \phi_+(x^+)$$

where $x^\pm = t \pm x$. In fact, the decomposition isn't quite as clean because there is also
a zero mode which does not naturally divide between the two. We will ignore this fact
here.

These chiral bosons give us a novel perspective on the dual scalar. The relation
(7.54) is solved by writing

$$\tilde\phi = 2\pi\beta^2\,(\phi_- - \phi_+)$$

We can then express the chiral bosons in terms of the scalar and its dual by

$$\phi_\pm = \frac{1}{2}\left(\phi \mp \frac{1}{2\pi\beta^2}\,\tilde\phi\right) \qquad (7.55)$$

Indeed, we can check that

$$\partial_\pm\tilde\phi = \mp 2\pi\beta^2\,\partial_\pm\phi \quad\Rightarrow\quad \partial_+\phi_- = \partial_-\phi_+ = 0$$

as required.


7.5.2 Canonical Quantisation of the Boson 


Let's now consider what happens when we quantise the boson. Let's start by ignoring
the fact that $\phi$ is compact: we'll then reinstate this condition later when we discuss
the viable operators in the theory. In the Schrödinger picture, we expand the operator
$\phi(x)$ in Fourier modes, following the usual story in Quantum Field Theory

$$\phi(x) = \frac{1}{\beta}\int\frac{dp}{2\pi}\,\frac{1}{\sqrt{2|p|}}\left(a_p\,e^{ipx} + a_p^\dagger\,e^{-ipx}\right)e^{-|p|/2\Lambda}$$

Classically, the momentum is $\pi = \beta^2\dot\phi$. In the Schrödinger picture, this is written as
the operator

$$\pi(x) = -i\beta\int\frac{dp}{2\pi}\,\sqrt{\frac{|p|}{2}}\left(a_p\,e^{ipx} - a_p^\dagger\,e^{-ipx}\right)e^{-|p|/2\Lambda}$$

We've introduced a UV cut-off $\Lambda$ in these expressions. We'll see the utility of this
shortly. As for fermions, we also introduce the UV length scale $\epsilon = 1/\Lambda$. Using the
usual commutation relations among the creation and annihilation operators $[a_p, a_q^\dagger] = 2\pi\,\delta(p - q)$, we have

$$[\phi(x), \pi(y)] = \frac{i}{\pi}\,\frac{\epsilon}{(x-y)^2+\epsilon^2} \ \longrightarrow\ i\,\delta(x - y) \ \ {\rm as}\ \epsilon\to 0$$

How do we construct the quantum operator for the chiral boson (7.55)? The dual
scalar obeys $\partial_x\tilde\phi = -2\pi\beta^2\dot\phi = -2\pi\,\pi(x)$, where $\pi(x)$ is the momentum. We can then write
down a quantum operator in the Schrödinger picture, by integrating the momentum thus:


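$$\phi_\pm(x) = \frac{1}{2}\left[\phi(x) \pm \frac{1}{\beta^2}\int_{-\infty}^x dx'\ \pi(x')\right] \qquad (7.56)$$

Here we see what we promised earlier: the chiral bosons $\phi_\pm(x)$ are inherently non-local
objects: they require knowledge of the profile of the field everywhere to the left of the
point $x$. To check that these are indeed the right objects, we can work in our mode
expansion. We have

$$\phi_-(x) = \frac{1}{2\beta}\int\frac{dp}{2\pi}\,\frac{1}{\sqrt{2|p|}}\left(1 + \frac{|p|}{p}\right)\left(a_p\,e^{ipx} + a_p^\dagger\,e^{-ipx}\right)e^{-|p|/2\Lambda}$$

$$= \frac{1}{\beta}\int_0^\infty\frac{dp}{2\pi}\,\frac{1}{\sqrt{2|p|}}\left(a_p\,e^{ipx} + a_p^\dagger\,e^{-ipx}\right)e^{-p/2\Lambda}$$

which picks up contributions only from the right-moving, $p > 0$ modes. This is reminiscent
of the expansion (7.45) for the Weyl fermion $\psi_-$. Similarly,

$$\phi_+(x) = \frac{1}{\beta}\int_{-\infty}^0\frac{dp}{2\pi}\,\frac{1}{\sqrt{2|p|}}\left(a_p\,e^{ipx} + a_p^\dagger\,e^{-ipx}\right)e^{-|p|/2\Lambda}$$

which picks up contributions only from left-moving, $p < 0$ modes. This is reminiscent
of the expansion (7.46) for $\psi_+$.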


The commutation relations of $\phi_\pm$ are easily computed. We have

$$[\phi_\pm(x), \phi_\pm(y)] = \pm\frac{1}{4\beta^2}\left(\int_{-\infty}^y dy'\ [\phi(x), \pi(y')] + \int_{-\infty}^x dx'\ [\pi(x'), \phi(y)]\right) = \mp\frac{i}{4\beta^2}\,{\rm sign}(x - y) \qquad (7.57)$$

Again, we see the non-locality of chiral bosons in their commutation relation. The
operators fail to commute no matter how far separated. Meanwhile,

$$[\phi_+(x), \phi_-(y)] = -\frac{i}{4\beta^2} \qquad (7.58)$$

This latter commutation relation is telling us that, in contrast to the Weyl fermions,
the left and right moving scalars have not fully decoupled. The culprit is the zero
momentum mode of the scalar, which is shared by both $\phi_+$ and $\phi_-$. This zero mode
is an important subtlety in a number of applications, but we will not treat it properly
here. A slightly better treatment can be found in the lectures on String Theory.


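Before we proceed, we need one more computation under our belts. This is the
Green's function for the chiral bosons, $\langle\phi_\pm(x)\,\phi_\pm(y)\rangle$. This is straightforward. To
avoid UV divergences, we first subtract the constant term and define

$$G_\pm(x, y) = \langle\phi_\pm(x)\,\phi_\pm(y)\rangle - \langle\phi_\pm(0)^2\rangle$$

We then have

$$G_-(x,y) = \frac{1}{4\beta^2}\int_0^\infty\frac{dp\,dq}{(2\pi)^2}\,\frac{2}{\sqrt{pq}}\,\langle a_p\,a_q^\dagger\rangle\left(e^{ipx-iqy} - 1\right)e^{-(p+q)/2\Lambda}$$

$$= \frac{1}{4\beta^2}\int_0^\infty\frac{dp}{2\pi}\,\frac{2}{p}\left(e^{ip(x-y)} - 1\right)e^{-p/\Lambda}$$

$$= \frac{1}{4\pi\beta^2}\,\log\left(\frac{\epsilon}{\epsilon - i(x-y)}\right)$$

Note that $G_-(x, x) = 0$, as it should. Meanwhile, at large distances the Green's function
exhibits a logarithmic divergence. This infra-red behaviour is characteristic of massless
scalar fields in two dimensions. Similarly, we have

$$G_+(x,y) = \frac{1}{4\pi\beta^2}\,\log\left(\frac{\epsilon}{\epsilon + i(x-y)}\right)$$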
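The logarithm in $G_-$ can be checked against a direct numerical evaluation of the regulated momentum integral. This is a sketch under stated assumptions: it takes $G_-(r) = \frac{1}{4\pi\beta^2}\int_0^\infty \frac{dp}{p}\,(e^{ipr} - 1)\,e^{-p\epsilon}$ with $r = x - y$, and the function names and the midpoint rule are choices made here for illustration.

```python
import cmath
import math

def chiral_green_numeric(r, eps, beta_sq, steps=200_000):
    """Midpoint-rule estimate of (1/(4 pi beta^2)) int_0^inf dp/p (e^{ipr} - 1) e^{-p eps}."""
    p_max = 40.0 / eps  # the tail beyond this is suppressed by e^{-40}
    dp = p_max / steps
    total = 0j
    for k in range(steps):
        p = (k + 0.5) * dp  # midpoints avoid the removable singularity at p = 0
        total += (cmath.exp(1j * p * r) - 1.0) / p * math.exp(-p * eps)
    return total * dp / (4.0 * math.pi * beta_sq)

def chiral_green_closed(r, eps, beta_sq):
    """Closed form (1/(4 pi beta^2)) log(eps / (eps - i r))."""
    return cmath.log(eps / (eps - 1j * r)) / (4.0 * math.pi * beta_sq)
```

The real part of `chiral_green_closed` is $-\frac{1}{8\pi\beta^2}\log(1 + r^2/\epsilon^2)$, which makes the logarithmic growth at large separation explicit.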
The Correlators 


Finally, we have the tools to compute correlation functions in this theory. But the
question that we should first ask is: what are the operators? The first point to note
is that $\phi$ is not a good operator, because the classical field is not single valued. The
same is true of the dual $\tilde\phi$. Instead, we must work with derivatives such as $\partial\phi$ or with
so-called vertex operators of the form

$$:e^{i\phi}: \quad {\rm and} \quad :e^{i\tilde\phi}:$$

where, as usual, normal ordering means all annihilation operators are moved to the
right. Whenever we write an operator like $e^{i\phi}$ or $\cos\phi$, we will always mean the normal
ordered version of these operators. In subsequent equations, we will keep punctuation
to a minimum and usually won't explicitly write the : :.

In what follows, we will compute correlation functions of the form

$$\langle e^{i\phi_-(x)}\,e^{-i\phi_-(y)}\rangle \quad {\rm and} \quad \langle e^{i\phi_+(x)}\,e^{-i\phi_+(y)}\rangle$$

In the next section we will then compare these with expressions involving fermions.
At the same time, we will look a little more closely at the conditions for $e^{i\phi_\pm}$ to be
consistent with the periodicity of $\phi$.


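To compute these expressions, we need to think more carefully about what the normal
ordering means. For this, we will need the usual BCH identity,

$$e^A e^B = e^{A+B+\frac{1}{2}[A,B]+\ldots} = e^B e^A e^{[A,B]}$$

where the higher order terms vanish whenever $[A, B]$ is a constant. We apply this to
the operators $A = \alpha a + \alpha' a^\dagger$ and $B = \beta a + \beta' a^\dagger$. We have

$$:e^A:\ :e^B:\ = e^{\alpha' a^\dagger}\,e^{\alpha a}\,e^{\beta' a^\dagger}\,e^{\beta a}$$

$$= e^{\alpha' a^\dagger}\,e^{\beta' a^\dagger}\,e^{\alpha a}\,e^{\beta a}\ e^{\alpha\beta'}$$

$$= \ :e^{A+B}:\ e^{\langle AB\rangle}$$

Here $\langle AB\rangle = \alpha\beta'$ is the vacuum expectation value, using $[a, a^\dagger] = 1$.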

Applying this to the vertex operators $e^{i\phi_\pm}$, which are nothing more than exponentials of
many creation and annihilation operators, we have

$$e^{i\phi_-(x)}\,e^{-i\phi_-(y)} = \ :e^{i\phi_-(x) - i\phi_-(y)}:\ e^{G_-(x,y)}$$

But the correlation function on the right-hand side is of a normal ordered operator and
this is simply $\langle :e^{i\phi_-(x) - i\phi_-(y)}: \rangle = 1$, since only the 1 in the Taylor expansion of the
exponential contributes. We're left with

$$\langle e^{i\phi_-(x)}\,e^{-i\phi_-(y)}\rangle = e^{G_-(x,y)} = \left(\frac{\epsilon}{\epsilon - i(x-y)}\right)^{1/4\pi\beta^2} \qquad (7.59)$$

Similarly

$$\langle e^{i\phi_+(x)}\,e^{-i\phi_+(y)}\rangle = e^{G_+(x,y)} = \left(\frac{\epsilon}{\epsilon + i(x-y)}\right)^{1/4\pi\beta^2} \qquad (7.60)$$

Note that the correlation functions depend in an interesting way on the radius of the
compact boson $\beta^2$. This confirms a statement that we made at the beginning of this
section: the radius of the boson $\beta^2$ is a genuine parameter of the theory. In the language
of conformal field theory, we would say that the operator $e^{i\phi_\pm}$ has dimension $1/8\pi\beta^2$.

7.5.3 The Bosonization Dictionary

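The hard work is now behind us. Looking at the correlation functions (7.59) and (7.60),
it is clear that they take a particularly simple form if we choose the radius of the boson
to be

$$\beta^2 = \frac{1}{4\pi}$$

We can then compare correlation functions for right-moving fermions (7.47) and bosons
(7.59),

$$\langle\psi_-(x)\,\psi_-^\dagger(y)\rangle = \frac{i}{2\pi}\,\frac{1}{(x - y) + i\epsilon} \quad {\rm and} \quad \langle e^{i\phi_-(x)}\,e^{-i\phi_-(y)}\rangle = \frac{\epsilon}{\epsilon - i(x-y)} = \frac{i\epsilon}{(x-y) + i\epsilon}$$

This tells us that we should identify

$$\psi_-(x) \longleftrightarrow \frac{1}{\sqrt{2\pi\epsilon}}\ e^{i\phi_-(x)} \qquad (7.61)$$

where, recall, $\Lambda = 1/\epsilon$ is our UV cut-off. Similarly, comparing the correlation functions
for left-moving operators, we have the map

$$\psi_+(x) \longleftrightarrow \frac{1}{\sqrt{2\pi\epsilon}}\ e^{-i\phi_+(x)} \qquad (7.62)$$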

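We can also develop the map between composite operators. The simplest is the
quadratic, mass term for fermions

$$\bar\psi\psi = \psi_-^\dagger(x)\,\psi_+(x) + \psi_+^\dagger(x)\,\psi_-(x) \longleftrightarrow \frac{1}{2\pi\epsilon}\left(e^{-i\phi_-(x)}\,e^{-i\phi_+(x)} + e^{i\phi_+(x)}\,e^{i\phi_-(x)}\right)$$

At this point, we just need to use the standard BCH identity, $e^A e^B = e^{A+B}\,e^{[A,B]/2}$.
Using the commutation relation (7.58), we have

$$\bar\psi\psi \longleftrightarrow -\frac{1}{2\pi\epsilon}\left(e^{i\phi} + e^{-i\phi}\right) = -\frac{1}{\pi\epsilon}\,\cos\phi \qquad (7.63)$$

Similarly, the chiral mass term

$$i\bar\psi\gamma^3\psi \longleftrightarrow -\frac{1}{\pi\epsilon}\,\sin\phi$$

These will be important in the next section when we will understand better how to
think of massive fermions in the bosonic language.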


Matching Currents 


Bosonization is a kind of duality, in which two seemingly different theories secretly
describe the same physics. In any such duality, the most important objects to match
on both sides are the conserved currents. We will see how this pans out in the present
context.

The vector (7.43) and axial (7.44) currents are, like the mass term, composite,
quadratic operators. For example,

$$j^1_V = -j^0_A = -\psi_+^\dagger\psi_+ + \psi_-^\dagger\psi_- \quad {\rm and} \quad j^0_V = \psi^\dagger\psi = \psi_+^\dagger\psi_+ + \psi_-^\dagger\psi_-$$

However, it turns out that we need to be a little more careful in defining these operators.
We do this through point splitting. For example, consider

$$\psi_-^\dagger\psi_- = \lim_{y\to x}\ \psi_-^\dagger(x)\,\psi_-(y)$$

$$= \lim_{y\to x}\ \frac{1}{2\pi\epsilon}\ :e^{-i(\phi_-(x) - \phi_-(y))}:\ e^{G_-(x,y)}$$

$$= \lim_{y\to x}\ \frac{1}{2\pi\epsilon}\left[1 - i(x-y)\,\frac{\partial\phi_-(x)}{\partial x} + \ldots\right]\frac{\epsilon}{\epsilon - i(x - y)}$$

Note that this expression comes with an infinite, constant term. We can remove this
simply by normal ordering the fermionic operator. Identical calculations also hold for
$\psi_+^\dagger\psi_+$, leaving us with the map

$$\psi_\pm^\dagger\psi_\pm \longleftrightarrow -\frac{1}{2\pi}\,\frac{\partial\phi_\pm}{\partial x}$$


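From this we can read off the map between currents,

$$j^0_V = -\frac{1}{2\pi}\,\frac{\partial(\phi_- + \phi_+)}{\partial x} = -\frac{1}{2\pi}\,\frac{\partial\phi}{\partial x}$$

and

$$j^1_V = -\frac{1}{2\pi}\,\frac{\partial(\phi_- - \phi_+)}{\partial x} = -\frac{1}{2\pi}\,\frac{1}{2\pi\beta^2}\,\frac{\partial\tilde\phi}{\partial x}$$

Recalling that the classical momentum is $\pi = \beta^2\dot\phi$, we identify $j^1_V \leftrightarrow \dot\phi/2\pi$. In other
words, we learn that the vector current of fermions is related to the topological current
in the bosonic language

$$j^\mu_V \longleftrightarrow -j^\mu_{\rm wind} = -\frac{1}{2\pi}\,\epsilon^{\mu\nu}\partial_\nu\phi \qquad (7.64)$$

Similarly,

$$j^\mu_A \longleftrightarrow -2\,j^\mu_{\rm shift} = -2\beta^2\,\partial^\mu\phi = -\frac{1}{2\pi}\,\partial^\mu\phi \qquad (7.65)$$

The methods that we've described above can be used to find the map between all other
operators in the theory. For our purposes, the basic dictionary described above will
suffice.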


7.5.4 The Allowed Operators: Is the Boson Really a Fermion? 


We have seen that, when $\beta^2 = 1/4\pi$, the operators $e^{i\phi_\pm}$ can be identified with free
fermions through the map (7.61) and (7.62). But there is one subtlety that we didn't
address: are the operators $e^{i\phi_\pm}$ compatible with the periodicity of $\phi$?

Because $\phi \in [0,2\pi)$, the operator $e^{i\phi}$ is perfectly fine, as indeed is $e^{in\phi}$ for any $n \in \mathbf{Z}$.
The dual scalar, defined by (7.54), also has periodicity $\tilde\phi \in [0, 2\pi)$, so that $e^{i\tilde\phi}$ is also
fine. In general, we can have any operator of the form $e^{in\phi + iw\tilde\phi}$ with $n, w \in \mathbf{Z}$. For a
general value of $\beta^2$, this means that the allowed operators are

$$e^{in\phi + iw\tilde\phi} = e^{i(n + 2\pi\beta^2 w)\phi_-}\ e^{i(n - 2\pi\beta^2 w)\phi_+}$$

Restricting to $\beta^2 = 1/4\pi$, we have

$$e^{in\phi + iw\tilde\phi} = e^{i(n + w/2)\phi_-}\ e^{i(n - w/2)\phi_+}$$

To get a purely chiral operator we could, for example, set $n = 1$ and $w = \pm 2$. But this
leaves us with $e^{2i\phi_\pm}$, rather than $e^{i\phi_\pm}$. This is rather disconcerting, since it means that
the operators $e^{i\phi_\pm}$ are not in the spectrum of the theory because they are incompatible
with the periodicity of $\phi$ and $\tilde\phi$. Yet these are precisely the operators that we want to
identify with a single fermion. What's going on?!
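The counting argument above can be made concrete by enumerating the allowed exponents at $\beta^2 = 1/4\pi$. The sketch below is illustrative: the function names are invented here, and exact rational arithmetic is used so that half-integer exponents are represented without rounding.

```python
from fractions import Fraction

def chiral_exponents(n, w, two_pi_beta_sq=Fraction(1, 2)):
    """Exponents (a_minus, a_plus) such that
    e^{i n phi + i w phi~} = e^{i a_minus phi_-} e^{i a_plus phi_+}.
    The default 2 pi beta^2 = 1/2 corresponds to beta^2 = 1/(4 pi)."""
    return (n + two_pi_beta_sq * w, n - two_pi_beta_sq * w)

def allowed_exponents(max_charge=4):
    """All exponent pairs reachable with integers |n|, |w| <= max_charge."""
    charges = range(-max_charge, max_charge + 1)
    return {chiral_exponents(n, w) for n in charges for w in charges}
```

For example, `(2, 0)` and `(0, 2)` appear in `allowed_exponents()`, corresponding to $e^{2i\phi_-}$ and $e^{2i\phi_+}$, while `(1, 0)` and `(0, 1)` never do. The reason is visible in the formula: the two exponents always sum to $2n$, an even integer.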


The answer is that the compact boson is not actually equivalent to a theory of a
free fermion. Instead, it is equivalent to a theory of a fermion coupled to a $\mathbf{Z}_2$ gauge
symmetry, acting as

$$\mathbf{Z}_2:\ \psi \mapsto -\psi \qquad (7.66)$$

This eliminates the single fermion from the spectrum, but leaves us with the composite
operators $\bar\psi\psi$ and $\psi\psi$.

(I'm grateful to Carl Turner for explaining this to me.)

The need to couple the free fermion to a $\mathbf{Z}_2$ gauge field shows up in another way
which we briefly describe here. If the two theories are equivalent, then their partition
functions should coincide. It is straightforward to compute the partition function for
the compact boson on a torus $T^2$. It agrees with that of a free fermion only if we
sum over both periodic and anti-periodic boundary conditions on the torus. (These
are usually referred to as Ramond and Neveu-Schwarz sectors respectively.) The fact
that we need to sum over both boundary conditions is another way of saying that the
fermion is coupled to a $\mathbf{Z}_2$ gauge field, ensuring that configurations related by (7.66)
are physically identified.


7.5.5 Massive Thirring = Sine-Gordon 


Having spent all this time developing the bosonization dictionary, we can now use it in 
anger. As we will see, the nice thing about the bosonization map is that it very often 
takes a strongly coupled theory and rewrites it in terms of a weakly coupled theory 
using the other variables. 


Let's go back to the free theory of a compact scalar,

$$S = \int d^2x\ \frac{\beta^2}{2}\,(\partial_\mu\phi)^2$$

We know that for the specific value $\beta^2 = 1/4\pi$, this is equivalent to a free, massless
Dirac fermion. But what about the other values of $\beta^2$? This is easy to answer using
our bosonization dictionary. We split the kinetic term up as

$$\beta^2 = \frac{1}{4\pi} + \frac{g}{2\pi^2}$$

and think of the second piece, proportional to $g$, as a bosonic current-current interaction,

$$j^\mu_{\rm wind}\ j_{{\rm wind}\,\mu} = -\frac{1}{4\pi^2}\,(\partial_\mu\phi)^2$$

Adding such a current is straightforward for the boson: it just shifts the coefficient of
the kinetic term away from the magic value. Written in terms of the fermion, it must
again be a current-current interaction, this time of the form

$$j^\mu_V\ j_{V\mu} = (\bar\psi\gamma^\mu\psi)(\bar\psi\gamma_\mu\psi)$$


This is referred to as a Thirring interaction. Rather surprisingly, we learn that a
general, free compact boson corresponds to an interacting fermion.

More generally, we can consider the massive, interacting Thirring model, with action

$$S = \int d^2x\ i\bar\psi\,\partial\!\!\!/\,\psi - m\bar\psi\psi - g\,(\bar\psi\gamma^\mu\psi)(\bar\psi\gamma_\mu\psi) \qquad (7.67)$$

Bosonization maps this into a compact boson with a potential, known as the Sine-Gordon
model,

$$S = \int d^2x\ \frac{\beta^2}{2}\,(\partial_\mu\phi)^2 + \frac{m}{\pi\epsilon}\,\cos\phi$$

Note that the action includes an explicit mention of the UV cut-off $\Lambda = 1/\epsilon$. The
potential $V(\phi) \sim -\cos\phi$ has its minimum at $\phi = 0$ and so, indeed, would seem to give
a mass to $\phi$ as required.


There are a couple of cute subtleties that we learn from the bosonization map. First,
we usually think about adding interaction terms to the Hamiltonian which are positive
definite. For our fermionic theory, the requirement is slightly different. We must have
$\beta^2 > 0$ on the bosonic side but, in terms of fermions, this translates to

$$g > -\frac{\pi}{2}$$

We learn that we can suffer a negative contribution to the Hamiltonian, as long as it's
not too negative.

Second, we expect that the role of $m$ is to make the excitation massive on both sides.
But that's not quite true. Recall that the two-point correlators (7.59) and (7.60) allow
us to read off the dimension of the vertex operators $e^{i\phi_\pm}$ or, equivalently, the dimension
of the fermion. This dimension is $1/8\pi\beta^2$. It means that the $\cos\phi$ potential for the
boson (or, equivalently, the mass term for the fermion) is relevant only if

$$\frac{1}{4\pi\beta^2} < 2 \quad\Rightarrow\quad \beta^2 > \frac{1}{8\pi} \quad\Rightarrow\quad g > -\frac{\pi}{4}$$

In other words, for $-\pi/2 < g < -\pi/4$, the mass term is an irrelevant operator and the
massive Thirring model describes a massless theory in the infra-red!
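The window of couplings just described is easy to tabulate. The sketch below assumes the relation $\beta^2 = 1/4\pi + g/2\pi^2$, which is the one consistent with the bounds $g > -\pi/2$ and $g > -\pi/4$ quoted above, and takes the dimension of $\cos\phi$ to be $1/4\pi\beta^2$, twice that of $e^{i\phi_\pm}$. The function names are illustrative.

```python
import math

def beta_sq_from_g(g):
    """Boson radius for Thirring coupling g, assuming beta^2 = 1/(4 pi) + g/(2 pi^2)."""
    return 1.0 / (4.0 * math.pi) + g / (2.0 * math.pi ** 2)

def cos_phi_dimension(g):
    """Scaling dimension 1/(4 pi beta^2) of cos(phi); requires beta^2 > 0."""
    beta_sq = beta_sq_from_g(g)
    if beta_sq <= 0.0:
        raise ValueError("beta^2 must be positive, i.e. g > -pi/2")
    return 1.0 / (4.0 * math.pi * beta_sq)

def mass_term_is_relevant(g):
    """In two dimensions an operator is relevant when its dimension is below 2."""
    return cos_phi_dimension(g) < 2.0
```

At $g = 0$ the dimension is 1, as expected for a free fermion mass term, and it reaches the marginal value 2 at $g = -\pi/4$.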
Fermion = Kink 


It will pay to look a little more closely at what becomes of a single, massive fermion.
The answer to this follows from looking at the map between currents (7.64). A single
fermion carries charge $Q_V = \int dx\ j^0_V = 1$. Correspondingly, it maps to a state
in the bosonic theory with charge

$$Q_{\rm wind} = \frac{1}{2\pi}\int dx\ \partial_x\phi = -1$$

It is straightforward to find a classical configuration that carries this charge. The
minima of the potential $V(\phi) \sim -\cos\phi$ lie at $\phi = 2\pi n$. We simply need to take a
configuration that interpolates between two minima, say from $\phi = 2\pi$ at $x \to -\infty$
to $\phi = 0$ at $x \to +\infty$. We learn that the fermion is identified with a kink in the
Sine-Gordon model.


We can explore this kink in more detail. The classical energy of any configuration in
the Sine-Gordon model can be written, up to an unimportant constant, as

$$E = \int dx\ \frac{\beta^2}{2}\,\phi'^{\,2} + \frac{2m}{\pi\epsilon}\,\sin^2(\phi/2)$$

We can rewrite this using the Bogomolnyi trick, in which we complete the square thus:

$$E = \int dx\ \frac{\beta^2}{2}\left(\phi' \pm \sqrt{\frac{4m}{\beta^2\pi\epsilon}}\,\sin(\phi/2)\right)^2 \mp \sqrt{\frac{4m\beta^2}{\pi\epsilon}}\ \phi'\sin(\phi/2) \qquad (7.68)$$

The first term is a total square, and hence positive definite. The second term is a total
derivative. This ensures that we can bound the energy of any configuration in terms
of the end points

$$E \geq 4\sqrt{\frac{m\beta^2}{\pi\epsilon}}\ \left|\Big[\cos(\phi/2)\Big]^{+\infty}_{-\infty}\right|$$

For a kink that interpolates between neighbouring minima, we have

$$E_{\rm kink} \geq 8\sqrt{\frac{m\beta^2}{\pi\epsilon}}$$

with equality if the Bogomolnyi equations are satisfied, which can be found in the total
square in (7.68),

$$\phi' = \pm\sqrt{\frac{4m}{\beta^2\pi\epsilon}}\ \sin(\phi/2)$$


These equations aren't quite satisfactory, since they still include the UV cut-off $\epsilon$.
This arises here because we're using an unholy combination of classical and quantum
analysis. Still, there's a simple way to fix it. For $g = 0$ or, equivalently, $\beta^2 = 1/4\pi$, the
Sine-Gordon model describes a free fermion. Here, the mass of the Bogomolnyi kink is

$$E_{\rm kink} = \frac{4}{\pi}\sqrt{\frac{m}{\epsilon}} \qquad (7.69)$$

which suggests that we should take $\epsilon = 16/m\pi^2$, or equivalently $\Lambda = m\pi^2/16$, if we want the
semi-classical analysis of the Sine-Gordon model to reproduce the mass $m$ of the fermion.

There is a more general lesson lurking here. Bosonization provides us with a duality
between two different theories, in which the elementary excitation of one theory is 
mapped into a soliton of the other. This, it turns out, is a characteristic signature of 
dualities in different dimensions. (We will meet an example in 3d where particles are 
mapped to vortices in Section 8.2.) Often these other dualities are not well understood. 
Two dimensional bosonization provides a useful grounding, where the map between the 
two theories can be performed explicitly. 
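Returning to the explicit kink of the Sine-Gordon model, the Bogomolnyi bound and the matching condition (7.69) can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the profile $\phi(x) = 4\arctan(e^{-\kappa x/2})$ with $\kappa = \sqrt{4m/\beta^2\pi\epsilon}$, which solves $\phi' = -\kappa\sin(\phi/2)$ and runs from $2\pi$ to $0$, and integrates the classical energy density with a midpoint rule. The function names and step sizes are choices made here.

```python
import math

def kink_profile(x, kappa):
    """phi(x) = 4 arctan(exp(-kappa x / 2)): 2 pi at x -> -inf, 0 at x -> +inf."""
    return 4.0 * math.atan(math.exp(-0.5 * kappa * x))

def bogomolnyi_bound(m, beta_sq, eps):
    """Lower bound 8 sqrt(m beta^2 / (pi eps)) on the kink energy."""
    return 8.0 * math.sqrt(m * beta_sq / (math.pi * eps))

def kink_energy(m, beta_sq, eps, steps=100_000):
    """Integrate (beta^2/2) phi'^2 + (2m/(pi eps)) sin^2(phi/2) over the kink."""
    kappa = math.sqrt(4.0 * m / (beta_sq * math.pi * eps))
    half_width = 40.0 / kappa          # energy density is negligible beyond this
    dx = 2.0 * half_width / steps
    h = 1e-5 / kappa                   # step for the central-difference derivative
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * dx
        phi = kink_profile(x, kappa)
        dphi = (kink_profile(x + h, kappa) - kink_profile(x - h, kappa)) / (2.0 * h)
        total += 0.5 * beta_sq * dphi ** 2
        total += (2.0 * m / (math.pi * eps)) * math.sin(0.5 * phi) ** 2
    return total * dx
```

With $\beta^2 = 1/4\pi$ and $\epsilon = 16/m\pi^2$, `kink_energy` returns a value close to $m$, and for other parameter choices it saturates `bogomolnyi_bound`.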


7.5.6 QED$_2$: The Schwinger Model

The Schwinger model is the name given to QED in two dimensions: it consists of a
single Dirac fermion, coupled to a U(1) gauge field. The action is

$$S = \int d^2x\ \frac{1}{2e^2}F_{01}^2 + \frac{\theta}{2\pi}F_{01} + i\bar\psi\,D\!\!\!\!/\,\psi - m\bar\psi\psi$$

As we have seen in Sections 7.1 and 7.2, Maxwell theory is strongly coupled in two
dimensions, and electric charges confine. When the fermion is very heavy, $m^2 \gg e^2$,
we can use standard perturbative techniques to solve the model. In contrast, when the
fermions are light the theory is strongly coupled and we must look elsewhere for help.
Fortunately, as we now see, bosonization will do the job for us.

The coupling between the fermion and the gauge field is buried in the covariant
derivative: $D\!\!\!\!/\,\psi = \partial\!\!\!/\,\psi - iA_\mu\gamma^\mu\psi$. As usual, the gauge field couples to the fermion
current, as $A_\mu j^\mu_V$. This makes it straightforward to write down the bosonised version,

$$S = \int d^2x\ \frac{1}{2e^2}F_{01}^2 + \frac{\theta}{2\pi}F_{01} + \frac{1}{8\pi}(\partial_\mu\phi)^2 + \frac{1}{2\pi}\,\epsilon^{\mu\nu}A_\mu\partial_\nu\phi + \frac{m}{\pi\epsilon}\cos\phi$$

$$= \int d^2x\ \frac{1}{2e^2}F_{01}^2 + \frac{\theta + \phi}{2\pi}F_{01} + \frac{1}{8\pi}(\partial_\mu\phi)^2 + \frac{m}{\pi\epsilon}\cos\phi \qquad (7.70)$$


where the second line follows after an integration by parts. Already here, there's
something rather nice. Suppose that the mass $m = 0$. The equation of motion for $\phi$ is
then

$$\frac{1}{4\pi}\,\partial^2\phi = \frac{1}{2\pi}\,F_{01}$$

But we know from our bosonization formula (7.65) that the axial current is $j^\mu_A = -\partial^\mu\phi/2\pi$,
so we can write this as

$$\partial_\mu j^\mu_A = -\frac{1}{\pi}\,F_{01}$$

But this agrees with our earlier derivation (3.36) of the anomaly in two dimensions.
Previously the anomaly was a subtle quantum effect; after bosonization, it simply
becomes the equation of motion.

Meanwhile, the equation of motion for the gauge field includes

$$\partial_1\left(\frac{1}{e^2}F_{01} + \frac{\phi}{2\pi}\right) = 0 \quad\Rightarrow\quad F_{01} = -\frac{e^2}{2\pi}\,\phi$$

where the second condition comes from requiring that this combination vanishes at
infinity. This is reminiscent of our result in Section 7.1 where we found that the theta
angle gives rise to a background electric field (7.7). However, once again, we find this
result simply from the classical equation of motion, without the need to invoke any
quantisation. A more careful analysis, along the lines of Section 7.1 shows that

$$F_{01} = -\frac{e^2}{2\pi}\,(\theta + \phi)$$

which seems very reasonable given the action (7.70). (Note: in Section 7.1, we denoted
the Wilson line as $\phi$; this is not to be confused with the bosonized fermion $\phi$ we are
working with here.)


To answer further questions, note that the gauge field $A_\mu$ only appears in the field
strength in (7.70). If we take the theory to sit on a line, so that there is no quantisation
condition on $F_{01}$, we can integrate out the gauge field to get

$$S = \int d^2x\ \frac{1}{8\pi}(\partial_\mu\phi)^2 + \frac{m}{\pi\epsilon}\cos\phi - \frac{e^2}{8\pi^2}\,(\phi + \theta)^2$$

Note that we have now lost the periodicity in $\phi$. (This is restored on a compact space
where $\int F_{01} \in 2\pi\mathbf{Z}$. In this case, the potential gets replaced by $\min_n\,(\theta + \phi + 2\pi n)^2$. We
encountered similar periodic, but non-smooth potentials in our study of 4d Yang-Mills
theory at large N in (6.18).)


There are a number of things we can now look at. First, suppose that our original
fermions were massless, with $m = 0$. Note that we can now absorb the theta angle
simply by shifting $\phi \to \phi - \theta$. This is to be expected: as discussed in Section 3.3.3,
the chiral anomaly means that the theta angle is always redundant in the presence of
massless fermions. We're left simply with a real scalar field whose mass is

$${\rm mass}^2 = \frac{e^2}{\pi}$$

We learn that the massless Schwinger model is not, in fact, massless. It has a gap.

Let's now turn on the fermion mass $m$. The minima of the potential now sit at

$$\sin\phi = -\frac{e^2\epsilon}{4\pi m}\,(\theta + \phi) \qquad (7.71)$$

For large $m$, this has many solutions but, at least when $\theta \neq \pi$, there is only a unique
ground state. There is now no kink solution that interpolates between neighbouring
minima because the minima are no longer degenerate. This reflects the physics of
confinement that we saw in Section 7.1: a single fermion costs infinite energy due to the
resulting flux tube which stretches to infinity. The finite energy excitations are mesons,
bound states of a fermion and an anti-fermion. One may use the bosonized action above
to study these in the limit of small mass.


Something interesting happens when $\theta = \pi$. This is simplest to see if we shift
$\hat\phi = \phi + \pi$. The minima of (7.71) then sit at

$$\sin\hat\phi = \frac{e^2\epsilon}{4\pi m}\,\hat\phi \qquad (7.72)$$

This can be solved graphically. When $m \gg e^2\epsilon$, there are many solutions. The obvious
one at $\hat\phi = 0$ is actually a local maximum of the potential. There are then two degenerate
minima. This is what we expect from our discussion in 7.1: integrating out the very
heavy fermion leaves us with pure Maxwell theory at $\theta = \pi$, and we know that this has
two degenerate ground states.

Now we can decrease the mass. The number of solutions to (7.72) starts to decrease
and for $m \ll e^2\epsilon$, we have just a single ground state at $\hat\phi = 0$. The critical point
happens at $4\pi m = e^2\epsilon$, when the two degenerate minima merge into a single one. But
this is a very familiar phase transition: it is described by the Ising critical point. We
learn that as we vary the mass at $\theta = \pi$, the Schwinger model becomes gapless and is
described by the 2d Ising CFT. Note that this is exactly the same behaviour that we
saw for the Abelian Higgs model in Section 7.2.
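The graphical solution of (7.72) can be mimicked by counting sign changes of $\sin\hat\phi - r\hat\phi$ on a grid, where $r = e^2\epsilon/4\pi m$. The sketch below is illustrative: the function names and grid size are assumptions made here, and the grid is chosen with an even number of midpoints so that the root at $\hat\phi = 0$ is bracketed rather than sampled.

```python
import math

def coupling_ratio(m, e_sq, eps):
    """r = e^2 eps / (4 pi m); the transition discussed in the text sits at r = 1."""
    return e_sq * eps / (4.0 * math.pi * m)

def count_stationary_points(r, samples=200_000):
    """Count solutions of sin(x) = r x, the stationary points of V ~ cos(x) + r x^2 / 2."""
    x_max = 1.0 / r + 1.0   # |sin x| <= 1 forces every solution to obey |x| <= 1/r
    dx = 2.0 * x_max / samples
    count = 0
    prev = None
    for k in range(samples):
        x = -x_max + (k + 0.5) * dx   # midpoints; x = 0 itself is never sampled
        value = math.sin(x) - r * x
        if prev is not None and value * prev < 0.0:
            count += 1
        prev = value
    return count
```

For $r > 1$ the only stationary point is $\hat\phi = 0$, which is then a minimum. For $r$ slightly below 1 there are three: the local maximum at the origin and the two degenerate minima. For small $r$ (heavy fermion) many more solutions appear.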


7.6 Non-Abelian Bosonization 


Consider $N$ massless Dirac fermions, $\psi_i$ with $i = 1, \ldots, N$. Decomposing each into
Weyl fermions, the action is

$$S = \int d^2x\ i\psi^\dagger_{-i}\,\partial_+\psi_{-i} + i\psi^\dagger_{+i}\,\partial_-\psi_{+i} \qquad (7.73)$$

where $\partial_\pm = \partial_t \pm \partial_x$. We clearly have a $U(N) \times U(N)$ chiral symmetry, which rotates the
left- and right-handed fermions separately. In fact, in two dimensions each Weyl fermion
can be further split into two Majorana-Weyl fermions. This follows from the fact that
we can choose a basis of gamma matrices (7.30) that are both in the chiral basis and
real. The upshot is that the free fermions (7.73) actually have an $O(2N) \times O(2N)$
chiral symmetry.

But what becomes of this symmetry on the bosonic side? We have $N$ compact,
real bosons $\phi_i$. Because these are compact, there is not even an $O(N)$ symmetry that
rotates them. (This is the statement that $\mathbf{R}^N$ has an $O(N)$ symmetry acting on it, but
the torus $T^N$ does not.) Instead, all we have is the Cartan subalgebra $U(1)^N$, together
with the corresponding action on the dual scalars.


What to make of this? One might think that it's no biggie: after all, the bosonic theory
should presumably have the enlarged symmetry since it's equivalent to its fermionic
cousin. But it would be nice to make this manifest. And, fortunately, there is a
beautiful way to do so, as first explained by Witten.


Here we will bosonize, keeping the $U(N) \times U(N)$ symmetry manifest, although a
similar method works for the $O(2N) \times O(2N)$ chiral symmetry too. Let's start by
looking at the currents. The overall $U(1) \times U(1)$ takes a similar form to the previous
section, but we write this as

$$j_- = \psi^\dagger_{-i}\psi_{-i} \quad {\rm and} \quad j_+ = \psi^\dagger_{+i}\psi_{+i}$$

These are the components of the vector and axial current written in the lightcone
coordinates $x^\pm = t \pm x$. But now we also have the non-Abelian flavour symmetries,
with the corresponding $SU(N)$ currents,

$$J^a_- = \psi^\dagger_{-i}\,T^a_{ij}\,\psi_{-j} \quad {\rm and} \quad J^a_+ = \psi^\dagger_{+i}\,T^a_{ij}\,\psi_{+j}$$

where $T^a$ are the generators of $su(N)$. The equations of motion for the fermions ensure
that the currents obey

$$\partial_+ j_- = \partial_- j_+ = 0 \quad {\rm and} \quad \partial_+ J^a_- = \partial_- J^a_+ = 0$$
We would like to ask: can we write down a bosonic model that has the same currents? 
Rather than jumping immediately to the model, we’re first going to write down an 


ansatz for the form of the currents, and then see if we can come up with an action 
which reproduces this. 
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We’ve already seen how to do this for the U(1) currents: we simply write them in 
terms of a compact boson @¢. In lightcone coordinates, this becomes 


1 l 1 
J- = z -9 and j} = -5—00 
1 2T 


What’s the analogous expression for the non-Abelian currents? Here’s a guess. First 
let’s write the Abelian currents in a way that highlights their U(1)-ness. We define 
$g = e^{i\phi} \in U(1)$. Then we can write 


$$ j_- = -\frac{i}{2\pi}\,g^{-1}\partial_- g \quad\text{and}\quad j_+ = \frac{i}{2\pi}\,g^{-1}\partial_+ g \tag{7.74} $$


This is now something that we can hope to generalise. We introduce the group-valued 
field 


$$ g(x,t) \in SU(N) $$


We then define the currents 


$$ J_- = -\frac{i}{2\pi}\,g^{-1}\partial_- g \quad\text{and}\quad J_+ = \frac{i}{2\pi}\,(\partial_+ g)\,g^{-1} \tag{7.75} $$


Note that the ordering of $g$ and $g^{-1}$ matters in these expressions and differs from what 
we might naively have written down simply by copying (7.74). The reason for the 
choice above is that we want these currents to obey conservation laws 


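$$ \partial_+ J_- = \partial_- J_+ = 0 \tag{7.76} $$


Happily, the ordering in (7.75) means that the first of these conservation laws implies 
the second, 


$$ \begin{aligned}
\partial_+ J_- = 0 \ \Rightarrow\ \ & (\partial_+ g^{-1})\,\partial_- g + g^{-1}\partial_+\partial_- g = 0 \\
\Rightarrow\ \ & g\,(\partial_+ g^{-1})\,\partial_- g + \partial_+\partial_- g = 0 \\
\Rightarrow\ \ & \partial_+ g\,(\partial_- g^{-1})\,g + \partial_+\partial_- g = 0 \\
\Rightarrow\ \ & \partial_+ g\,\partial_- g^{-1} + (\partial_+\partial_- g)\,g^{-1} = 0 \ \Rightarrow\ \partial_- J_+ = 0
\end{aligned} \tag{7.77} $$


Here the third line uses the identities $g\,\partial g^{-1} = -(\partial g)\,g^{-1}$ and $g^{-1}\partial g = -(\partial g^{-1})\,g$, 
both of which follow from differentiating $g g^{-1} = 1$. Had we chosen a different order of $g$ 
and $g^{-1}$ in (7.75) then the conservation laws (7.76) would turn out to be inconsistent 
with each other. 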
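

The role of the ordering can be checked numerically. The following is a minimal Python sketch, not part of the original argument. It assumes a hypothetical SU(2) test configuration $g = B(x^+)A(x^-)$, with $B = e^{i x^+ \sigma_x}$ and $A = e^{i x^- \sigma_z}$, which obeys $\partial_+(g^{-1}\partial_- g) = 0$ by construction. All function names are ours, and the constant prefactors of (7.75) are dropped since they do not affect conservation.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    """Conjugate transpose; for a unitary matrix this is the inverse."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def rot_x(t):
    """exp(i t sigma_x)."""
    c, s = complex(math.cos(t)), 1j * math.sin(t)
    return [[c, s], [s, c]]


def rot_z(t):
    """exp(i t sigma_z)."""
    return [[complex(math.cos(t), math.sin(t)), 0j], [0j, complex(math.cos(t), -math.sin(t))]]


def g(xp, xm):
    """Hypothetical test field g = B(x^+) A(x^-), chosen only for illustration."""
    return matmul(rot_x(xp), rot_z(xm))


def diff(f, xp, xm, direction, h=1e-4):
    """Central difference of a matrix-valued f along x^+ ("+") or x^- ("-")."""
    dp, dm = (h, 0.0) if direction == "+" else (0.0, h)
    hi, lo = f(xp + dp, xm + dm), f(xp - dp, xm - dm)
    return [[(hi[i][j] - lo[i][j]) / (2 * h) for j in range(2)] for i in range(2)]


def norm(a):
    """Frobenius norm of a 2x2 matrix."""
    return math.sqrt(sum(abs(a[i][j]) ** 2 for i in range(2) for j in range(2)))


def J_minus(xp, xm):
    """g^{-1} d_- g, the ordering used in (7.75)."""
    return matmul(dagger(g(xp, xm)), diff(g, xp, xm, "-"))


def J_plus(xp, xm):
    """(d_+ g) g^{-1}, the ordering used in (7.75)."""
    return matmul(diff(g, xp, xm, "+"), dagger(g(xp, xm)))


def J_plus_naive(xp, xm):
    """g^{-1} d_+ g, the ordering obtained by copying (7.74)."""
    return matmul(dagger(g(xp, xm)), diff(g, xp, xm, "+"))


print(norm(diff(J_minus, 0.3, 0.7, "+")))       # numerically zero
print(norm(diff(J_plus, 0.3, 0.7, "-")))        # numerically zero
print(norm(diff(J_plus_naive, 0.3, 0.7, "-")))  # of order one
```

For this configuration the two currents with the ordering of (7.75) are conserved to within finite-difference error, while the naively ordered $g^{-1}\partial_+ g$ is not.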


Now we've got a good candidate for the currents (7.75), we want to write down 
an action for g whose dynamics implies their conservation. In fact, given the group 
structure, we are pretty restricted in what we can write down. If we want an action 
with two derivatives, then there is a unique choice, 


$$ S = \frac{1}{4\lambda^2}\int d^2x\ {\rm tr}\left(\partial^\mu g^{-1}\,\partial_\mu g\right) \tag{7.78} $$


for some dimensionless coupling $\lambda^2$. We have met this structure before: it is identical to 
the chiral Lagrangian (5.7) that we used to describe pions in QCD. This is a non-linear 
sigma model, whose target space is the group manifold SU(N). In two dimensions, 
the sigma-models whose target spaces are group manifolds are sometimes referred to 
as principal chiral models. 


The action (7.78) enjoys two global symmetries, in which we act by an SU(N) 
transformation on either the left or right, 


$$ g \to Lg \quad\text{or}\quad g \to gR, \qquad L, R \in SU(N) $$


This gives rise to two currents $J^{L\,\mu} \sim (\partial^\mu g)\,g^{-1}$ and $J^{R\,\mu} \sim (\partial^\mu g^{-1})\,g$. (We computed 
these currents in the context of the chiral Lagrangian in (5.11) and (5.12).) These 
indeed take a similar form to our chiral currents $J_-$ and $J_+$ defined in (7.75), 
which is encouraging. However, closer inspection tells us that things aren’t quite as 
straightforward. The equation of motion from (7.78) implies that $\partial_\mu J^{L\,\mu} = \partial_\mu J^{R\,\mu} = 0$, 
but this is not the same thing as what we wanted in (7.76). We learn that the symmetry 
structure of the bosonic model (7.78) differs from that of N free fermions. 


There is also a dynamical reason why the sigma model (7.78) cannot describe free 
fermions: it is asymptotically free. The coupling $\lambda^2(\mu)$ runs with scale $\mu$ and its one-loop 
beta function can be shown to be 


$$ \mu\frac{d\lambda^2}{d\mu} = -(N-2)\frac{\lambda^4}{4\pi} $$


This is similar to the behaviour of the $\mathbb{CP}^{N-1}$ model that we met in Section 7.3. (It 
is even more similar to the behaviour of the O(N) models in two dimensions that we 
met in the lectures on Statistical Field Theory.) In the infra-red, the non-linear sigma 
model (7.78) is expected to flow to a gapped phase. 
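

To see how asymptotic freedom leads to strong coupling in the infra-red, we can integrate the one-loop flow. The following is a sketch only: it assumes the beta function takes the form $\mu\, d\lambda^2/d\mu = -(N-2)\lambda^4/4\pi$, and the reference scale $\mu_0$ and the strong-coupling scale $\Lambda$ are names introduced here for illustration.

```latex
\begin{aligned}
\mu\frac{d\lambda^2}{d\mu} = -\frac{N-2}{4\pi}\,\lambda^4
\quad &\Rightarrow\quad
\mu\frac{d}{d\mu}\left(\frac{1}{\lambda^2}\right) = \frac{N-2}{4\pi} \\
&\Rightarrow\quad
\frac{1}{\lambda^2(\mu)} = \frac{1}{\lambda^2(\mu_0)} + \frac{N-2}{4\pi}\log\left(\frac{\mu}{\mu_0}\right) \\
&\Rightarrow\quad
\lambda^2(\mu) \to \infty \ \ \text{as}\ \ \mu \to \Lambda = \mu_0\,\exp\left(-\frac{4\pi}{(N-2)\,\lambda^2(\mu_0)}\right)
\end{aligned}
```

The coupling shrinks logarithmically in the ultra-violet and the one-loop approximation breaks down at the dynamically generated scale $\Lambda$, which is where a gap is expected to appear.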


7.6.1 The Wess-Zumino-Witten Term 


The simple sigma-model (7.78) does not have the right properties to describe free 
fermions. However, it is possible to modify this theory to give us what we want. The 
modification is a little subtle, but it’s a subtlety that we have met before: the extra term 
cannot be written as an integral over 2d spacetime, but instead only over a 3d spacetime. 
Such terms are called Wess-Zumino-Witten terms, and we saw an example in Section 
5.5 in the context of the chiral Lagrangian for QCD. 


Things are simplest if we work in the Euclidean path integral and take our spacetime 
to be $S^2$. We introduce a three-dimensional ball, $D$, such that $\partial D = S^2$. We extend the 
fields $g(x,t)$ over $S^2$ to $g(y)$, where $y$ are coordinates on the ball $D$. We then consider 
the modified action, 


$$ S = \frac{1}{4\lambda^2}\int d^2x\ {\rm tr}\left(\partial_\mu g\,\partial^\mu g^{-1}\right) + k\int_D d^3y\ \omega \tag{7.79} $$


where 


$$ \omega = \frac{1}{24\pi}\,\epsilon^{\mu\nu\rho}\ {\rm tr}\left(g^{-1}\frac{\partial g}{\partial y^\mu}\,g^{-1}\frac{\partial g}{\partial y^\nu}\,g^{-1}\frac{\partial g}{\partial y^\rho}\right) $$


This has a very similar structure to the five-dimensional WZW term (5.35) that we 
introduced in Section 5.5. 


Just as in the 4d story, there is an ambiguity in our choice of three-dimensional ball $D$ 
with $\partial D = S^2$. We could just as well take a ball $D'$, also with $\partial D' = S^2$ but with the 
opposite orientation. The now-familiar topological quantisation conditions tell us that 


$$ \exp\left(ik\int_D d^3y\ \omega\right) = \exp\left(-ik\int_{D'} d^3y\ \omega\right) \quad\Rightarrow\quad \exp\left(ik\int_{S^3} d^3y\ \omega\right) = 1 $$


where we have stitched together the two three-balls to make the three-sphere 
$S^3 = D \cup D'$. The integrand provides a map from $S^3$ to the group manifold SU(N) with 
fields $g(y)$. But, as we saw in the context of instantons in Section 2.3, these maps are 
characterised by the homotopy group 


$$ \Pi_3(SU(N)) = \mathbb{Z} \quad\text{for}\ \ N \geq 2 $$


It turns out that, for configurations with winding $n$, the WZW term evaluates to 
$\int_{S^3} d^3y\ \omega = 2\pi n$. This quantisation condition then tells us that the coefficient of the 
WZW term must be an integer, 


$$ k \in \mathbb{Z} $$


We refer to this integer as the level. 


The effect of the WZW term in two dimensions is, in many ways, much more dramatic 
than that of its four dimensional counterpart. In 4d, we had to look at rather specific 
scattering processes, or baryons, to see the implications of the WZW term. In contrast, 
in 2d the presence of the WZW term affects even the phase of the theory. To see this, 
we can look again at the beta function for $\lambda^2$. At one-loop, one finds that it picks up 
an extra term, given by 


$$ \mu\frac{d\lambda^2}{d\mu} = -(N-2)\frac{\lambda^4}{4\pi}\left[1 - \left(\frac{\lambda^2 k}{4\pi}\right)^2\right] $$

We see that there is now a fixed point of the RG equation, at 


$$ \lambda^2 = \frac{4\pi}{|k|} \tag{7.80} $$


Here the theory is described by a gapless CFT, known as the $SU(N)_k$ WZW theory. It 
is completely solvable using various CFT techniques, although we will not discuss these 
here. Since our one-loop computation is valid for $\lambda^2 \ll 1$, we can trust the existence of 
this fixed point only when $k \gg 1$ and the theory remains weakly coupled. Nonetheless, 
the fixed point is known to persist for all $k \in \mathbb{Z}$. 
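

The approach to the fixed point can be illustrated numerically. The following Python sketch is a toy calculation under stated assumptions: it takes the one-loop beta function with the WZW term to be $\mu\, d\lambda^2/d\mu = -(N-2)\frac{\lambda^4}{4\pi}\left[1 - (\lambda^2 k/4\pi)^2\right]$, and the function names, the sample values N = 4 and k = 10, and the step sizes are all ours. It integrates the flow towards the infra-red from couplings on either side of $4\pi/|k|$.

```python
import math


def beta(lam2, n_colours, level):
    """Assumed one-loop beta function  mu d(lambda^2)/d(mu)  including the WZW term."""
    ratio = lam2 * level / (4 * math.pi)
    return -(n_colours - 2) * lam2 ** 2 / (4 * math.pi) * (1 - ratio ** 2)


def flow_to_infrared(lam2, n_colours, level, t_max=200.0, dt=0.01):
    """Integrate d(lambda^2)/dt = -beta with t = log(mu_0 / mu), using fourth-order Runge-Kutta."""
    def rhs(x):
        return -beta(x, n_colours, level)

    for _ in range(int(t_max / dt)):
        k1 = rhs(lam2)
        k2 = rhs(lam2 + 0.5 * dt * k1)
        k3 = rhs(lam2 + 0.5 * dt * k2)
        k4 = rhs(lam2 + dt * k3)
        lam2 += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return lam2


fixed_point = 4 * math.pi / 10
print(flow_to_infrared(0.5, n_colours=4, level=10), fixed_point)  # starts below the fixed point
print(flow_to_infrared(2.0, n_colours=4, level=10), fixed_point)  # starts above the fixed point
```

Both starting values are driven to the same coupling in the infra-red, which is the statement that, within this one-loop form, the fixed point is attractive at low energies.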


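At the fixed points, something nice happens with the currents. The classical equation 
of motion of the action (7.79) is 


$$ \frac{1}{2\lambda^2}\,\partial_\mu\left(g^{-1}\partial^\mu g\right) - \frac{k}{8\pi}\,\epsilon^{\mu\nu}\partial_\mu\left(g^{-1}\partial_\nu g\right) = 0 $$


In lightcone coordinates, with metric $\eta_{+-} = 1$, this reads 


$$ \left(\frac{1}{2\lambda^2} + \frac{k}{8\pi}\right)\partial_-\left(g^{-1}\partial_+ g\right) + \left(\frac{1}{2\lambda^2} - \frac{k}{8\pi}\right)\partial_+\left(g^{-1}\partial_- g\right) = 0 $$


At the fixed point (7.80), one of these terms vanishes. Which one depends on the sign 
of $k$. For $k > 0$, we’re left with 


$$ \partial_-\left(g^{-1}\partial_+ g\right) = 0 $$


Since $g^{-1}\partial_+ g = -(\partial_+ g^{-1})\,g$, this is, after exchanging $g \leftrightarrow g^{-1}$ to match the ordering 
in (7.75), precisely the condition $\partial_- J_+ = 0$ that we wanted for the chiral current (7.76). 
The other condition $\partial_+ J_- = 0$ then follows automatically, since the steps in (7.77) 
are reversible. 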


We’ve found that, for each N, there is a set of conformal field theories, labelled by 
$k \in \mathbb{Z}$. That’s nice but which, if any, describe N free fermions? The answer to this 
comes from looking more closely at the algebra obeyed by the SU(N) currents. We 
won’t give details of the calculation here, and instead just sketch the basic facts. The 
SU(N) currents turn out to obey an extension of the usual su(N) Lie algebra, with an 
extra term referred to as a central charge, 


$$ [J^a(x), J^b(y)] = i f^{abc}\,J^c(x)\,\delta(x-y) + \frac{ik}{2\pi}\,\delta^{ab}\,\delta'(x-y) $$


with $f^{abc}$ the structure constants of su(N) and $\delta'(x)$ the derivative of the delta function. 
This is known as a Kac-Moody algebra, and its properties are well studied. It is known 
that the algebra has unitary representations only if $k \in \mathbb{Z}$, a fact which sits well with 
our realisation as currents in the WZW model. 


One can also compute the same algebra for N free Dirac fermions. Here the computation 
is somewhat simpler and follows from the usual commutation relations for free 
fermions. One finds the Kac-Moody algebra above, but with the specific value $k = 1$. 
We learn that we can bosonize N free Dirac fermions to an SU(N) WZW model at 
level $k = 1$, together with a compact boson $\phi$ to describe the U(1) currents. In other 
words, the following action 


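$$ S = \int d^2x\ \left[\frac{1}{8\pi}(\partial_\mu\phi)^2 + \frac{1}{16\pi}\,{\rm tr}\left(\partial_\mu g^{-1}\,\partial^\mu g\right)\right] + \int_D d^3y\ \omega $$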


is, despite appearances, N free Dirac fermions in disguise. 


7.7 Further Reading 


Quantum field theories in low dimensions were originally studied by particle physicists. 
They were viewed as toy models, in which some of the more outlandish behaviour of 
quantum field theory, such as confinement, or a dynamically generated mass, could 
be viewed in a tractable setting, giving comfort in a time of confusion. Later it was 
realised that many of these quantum field theories have direct application to condensed 
matter systems. 


This programme was initiated by Schwinger who, in 1962, studied massless QED 
in d = 1 + 1 [174], in what is probably the first time that a strongly interacting 
quantum field theory was solved. This is a model which trivially confines and, somewhat 
less trivially, exhibits a mass gap. In these lectures, we solved it using bosonization 
techniques. Schwinger used operator methods. One conclusion that he took from this 
study was that thinking in terms of elementary particles can be misleading in strongly 
interacting field theories: 


“This line of thought emphasizes that the question “Which particles are 
fundamental?” is incorrectly formulated. One should ask “What are the 
fundamental fields?”.” 


The massive Schwinger model was revisited by Coleman and collaborators in the 1970s 
to better understand both confinement and the role played by the theta angle in two 
dimensions [28, 30]. The full phase structure of the theory, including the critical point 
at $\theta = \pi$, was described in [179]. 


Gross and Neveu introduced their models of N interacting fermions in 1974 [85]. 
Their goal was to test drive an asymptotically free theory which exhibits a dynamically 
generated mass scale as well as, in this case, dynamical spontaneous symmetry breaking. 
Witten later determined the spectrum of kinks [218] and showed how to reconcile the 
apparent breaking of the U(1) chiral symmetry [219] with the lack of Goldstone bosons 
in two dimensions in [134, 26]. 


The role of instantons in determining the phase structure of the two-dimensional 
Abelian-Higgs model was first discussed by Callan, Dashen and Gross in [24]. One 
might have thought that this was a warm-up to understanding the vacuum structure 
of four-dimensional gauge theories, but in fact it was a warm-down to check that their 
earlier 4d analysis was sensible. The full phase diagram, including the critical point at 
$\theta = \pi$, was described in the appendix of Witten’s $\mathbb{CP}^{N-1}$ paper [220]. A more modern 
perspective on this critical point was discussed in [123]. 


The $\mathbb{CP}^{N-1}$ model was proposed in 1978 [50, 81]. It was quickly noticed that it shares 
a number of properties with Yang-Mills, including asymptotic freedom, instantons and 
a large N expansion. It was first solved at large N by D’Adda, Lüscher and Di Vecchia 
[36]. Soon after, Witten studied the interplay between instantons, the theta term and 
the large N expansion, and argued that this provided a useful analogy for Yang-Mills 
in four dimensions [220]. The fact that the $\mathbb{CP}^1$ model at $\theta = \pi$ is a gapless theory 
was first conjectured by Haldane in [87]. 


In the high energy literature, bosonization was introduced by Sidney Coleman [29]. In 
the condensed matter literature, related results were derived slightly earlier by Luther 
and Peschel [128], and also by Mattis. Coleman ends his paper with the typically 
charming admission “Schroer has also pointed out that many of the results obtained 
here are in close correspondence with the results of [...] Luther and collaborators. 
Luther and I are in total agreement with Schroer on this point; we are also united in 
our embarrassment that we were incapable of reaching this conclusion unprompted. 
(Our offices are on the same corridor.)” The non-local relationship between fermions 
and bosons was discovered soon after by Mandelstam [131]. An earlier, lattice version 
of this relationship can be found in the Jordan-Wigner transformation. Finally, the 
non-Abelian bosonization is due to Witten in the beautiful paper [227]. 


There are a number of excellent reviews on bosonization, including [177, 178]. 


These lecture notes do not discuss conformal field theories in d = 1 + 1 dimensions. 
This is a vast topic that deserves its own course. An introduction to the very basics 
can be found in the lectures on string theory [192]; an introduction to more than the 
basics can be found in the lectures by Ginsparg [74]; and a fuller treatment can be 
found in the big yellow book [40]. 


8. Quantum Field Theory on the Plane 


In this section, we step up a dimension. We will discuss quantum field theories in 
d = 2+ 1 dimensions. Like their d = 1 + 1 dimensional counterparts, these theories 
have application in various condensed matter systems. However, they also give us 
further insight into the kinds of phases that can arise in quantum field theory. 


8.1 Electromagnetism in Three Dimensions 


We start with Maxwell theory in d = 2 + 1. The gauge field is $A_\mu$, with $\mu = 0,1,2$. 
The corresponding field strength describes a single magnetic field $B = F_{12}$, and two 
electric fields $E_i = F_{0i}$. We work with the usual action, 


$$ S_{\rm Maxwell} = \int d^3x\ -\frac{1}{4e^2}F_{\mu\nu}F^{\mu\nu} + A_\mu j^\mu \tag{8.1} $$


The gauge coupling has dimension $[e^2] = 1$. This is important. It means that U(1) 
gauge theories in d = 2+ 1 dimensions coupled to matter are strongly coupled in the 
infra-red. In this regard, these theories differ from electromagnetism in d = 3 + 1. 


We can start by thinking classically. The Maxwell equations are 


$$ \frac{1}{e^2}\,\partial_\mu F^{\mu\nu} = j^\nu $$


Suppose that we put a test charge $Q$ at the origin. The Maxwell equations reduce to 


$$ \nabla^2 A_0 = Q\,\delta^2(x) $$


which has the solution 


$$ A_0 = \frac{Q}{2\pi}\log\left(\frac{r}{r_0}\right) + \text{constant} $$


for some arbitrary $r_0$. We learn that the potential energy $V(r)$ between two charges, 
$Q$ and $-Q$, separated by a distance $r$, increases logarithmically 


$$ V(r) = \frac{Q^2}{2\pi}\log\left(\frac{r}{r_0}\right) + \text{constant} \tag{8.2} $$


This is a form of confinement, but it’s an extremely mild form of confinement as the 
log function grows very slowly. For obvious reasons, it’s usually referred to as log 
confinement. 
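

As a quick sanity check of the logarithmic potential, here is a minimal Python sketch, assuming the solution takes the form $A_0 = \frac{Q}{2\pi}\log(r/r_0)$. By the divergence theorem, $\nabla^2 A_0 = Q\,\delta^2(x)$ means that the flux of $\nabla A_0$ through any circle centred on the charge equals $Q$, whatever the radius. The function names `potential` and `flux_through_circle` are ours, introduced purely for illustration.

```python
import math


def potential(r, Q=1.0, r0=1.0):
    """Assumed solution A_0(r) = (Q / 2 pi) log(r / r0), valid away from the origin."""
    return Q / (2 * math.pi) * math.log(r / r0)


def flux_through_circle(r, Q=1.0, h=1e-6):
    """Radial derivative of A_0 (central difference) times the circumference 2 pi r."""
    dA_dr = (potential(r + h, Q) - potential(r - h, Q)) / (2 * h)
    return 2 * math.pi * r * dA_dr


for radius in (0.5, 2.0, 10.0):
    print(radius, flux_through_circle(radius))  # the flux is independent of the radius

# Each factor of ten in separation adds the same amount to the potential.
print(potential(10.0) - potential(1.0), potential(100.0) - potential(10.0))
```

The second print makes the mildness of log confinement concrete: every decade of separation costs the same fixed amount.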


In the absence of matter, we can look for propagating degrees of freedom of the gauge 
field itself. As explained in the previous section, we expect the gauge field to have a 
single, propagating polarisation state in d = 2 + 1 dimensions. 


8.1.1 Monopole Operators 


Something special happens for U(1) gauge theories in d = 2 + 1 dimensions: they 
automatically come with an associated global U(1) symmetry that we will call $U(1)_{\rm top}$, the 
“top” for “topological”. The associated current is 


$$ J^\mu_{\rm top} = \frac{1}{4\pi}\,\epsilon^{\mu\nu\rho}F_{\nu\rho} \tag{8.3} $$


which obeys the conservation condition $\partial_\mu J^\mu_{\rm top} = 0$ by the Bianchi identity on $F_{\mu\nu}$. The 
associated conserved quantity is simply the magnetic flux 


$$ Q_{\rm top} = \int d^2x\ J^0_{\rm top} = \frac{1}{2\pi}\int d^2x\ B $$


In quantum field theory, symmetries act on local operators. The operators that trans- 
form under U(1)top are not the usual fields of the theory. Rather, they are disorder 
operators, entirely analogous to the ’t Hooft lines that we introduced in Section 2.6. In 
the present context, they are referred to as monopole operators. 


We work in Euclidean space. A monopole operator $M(x)$ inserted at a point $x \in \mathbb{R}^3$ 
is defined in the path integral by requiring that we integrate over field configurations 
in which there is a Dirac monopole inserted at the point $x$. This means that, 
for an $S^2$ surrounding $x$, we have 


$$ \frac{1}{4\pi}\int_{S^2} d^2S_\mu\ \epsilon^{\mu\nu\rho}F_{\nu\rho} = 1 \tag{8.4} $$


This operator creates a single unit of magnetic flux so that, in the presence of $M(x)$, 
the topological current is no longer conserved; instead it has a source 


$$ \partial_\mu J^\mu_{\rm top} = \delta^3(x) \tag{8.5} $$


Equivalently, the monopole operator is charged under $U(1)_{\rm top}$ so that 


$$ U(1)_{\rm top}:\quad M(x) \to e^{i\alpha}M(x) \tag{8.6} $$


The definition of monopole operators given above is somewhat abstract. As we will 
now see, in certain phases of the theory it is possible to give a more concrete definition. 


Consider free Maxwell theory. Alternatively, consider U(1) gauge theory coupled to 
charged fields with masses $m \gg e^2$. In both cases, the theory lies in the Coulomb phase, 
meaning that the low energy spectrum contains just a single, free massless photon. The 
partition function is particularly straightforward; ignoring gauge fixing terms, we have 


$$ Z = \int \mathcal{D}A_\mu\ \exp\left(-\int d^3x\ \frac{1}{4e^2}F_{\mu\nu}F^{\mu\nu}\right) $$


Because the action depends only on $F_{\mu\nu}$, and not explicitly on $A_\mu$, we can choose 
instead to integrate over the field strength. However, we shouldn’t integrate over all 
field strengths; in the absence of monopole operators, we should integrate only over 
those that satisfy the Bianchi identity $\epsilon^{\mu\nu\rho}\partial_\mu F_{\nu\rho} = 0$. We can do this by introducing a 
Lagrange multiplier field $\sigma(x)$, 


$$ Z = \int \mathcal{D}F_{\mu\nu}\,\mathcal{D}\sigma\ \exp\left(-\int d^3x\ \frac{1}{4e^2}F_{\mu\nu}F^{\mu\nu} + \frac{i}{4\pi}\,\sigma\,\epsilon^{\mu\nu\rho}\partial_\mu F_{\nu\rho}\right) \tag{8.7} $$


If the field strength obeys the Dirac quantisation condition, then $\sigma$ has periodicity $2\pi$. 
But in this formulation, it is particularly straightforward to implement a monopole 
operator. We simply add to the path integral 


$$ M(x) \sim e^{i\sigma(x)} \tag{8.8} $$


This ensures that the topological current has a source (8.5) or, equivalently, inserts a 
monopole at $x$. 


We can now go one step further, and integrate out the field strength $F_{\mu\nu}$. We’re left 
with an effective action for the Lagrange multiplier field $\sigma(x)$ which, in this context, is 
usually referred to as the dual photon, 


$$ Z = \int \mathcal{D}\sigma\ \exp\left(-\int d^3x\ \frac{e^2}{8\pi^2}\,\partial_\mu\sigma\,\partial^\mu\sigma\right) \tag{8.9} $$


Clearly this describes a single, propagating degree of freedom. But this is what we 
expect for a photon in d = 2 + 1 which has just a single polarisation state. 
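

The step from (8.7) to (8.9) amounts to completing the square in the field strength. The following is a sketch of that computation, assuming the Lagrange multiplier term in (8.7) is normalised as $\frac{i}{4\pi}\sigma\,\epsilon^{\mu\nu\rho}\partial_\mu F_{\nu\rho}$ and dropping boundary terms. The result does not depend on the overall sign of this term, since flipping it simply flips the sign of the saddle point value of $F_{\nu\rho}$.

```latex
\begin{aligned}
&\text{Integrate by parts:} &
\frac{i}{4\pi}\int d^3x\ \sigma\,\epsilon^{\mu\nu\rho}\partial_\mu F_{\nu\rho}
  &= -\frac{i}{4\pi}\int d^3x\ \epsilon^{\mu\nu\rho}\,\partial_\mu\sigma\,F_{\nu\rho} \\[4pt]
&\text{Equation of motion for } F_{\nu\rho}\text{:} &
\frac{1}{2e^2}F^{\nu\rho} - \frac{i}{4\pi}\,\epsilon^{\mu\nu\rho}\partial_\mu\sigma = 0
  \ &\Rightarrow\ F^{\nu\rho} = \frac{ie^2}{2\pi}\,\epsilon^{\mu\nu\rho}\partial_\mu\sigma \\[4pt]
&\text{Use } \epsilon^{\mu\nu\rho}\epsilon_{\alpha\nu\rho} = 2\delta^\mu_\alpha\text{:} &
\frac{1}{4e^2}F_{\nu\rho}F^{\nu\rho}
  &= -\frac{e^2}{8\pi^2}\,\partial_\mu\sigma\,\partial^\mu\sigma \\[4pt]
& &
-\frac{i}{4\pi}\,\epsilon^{\mu\nu\rho}\partial_\mu\sigma\,F_{\nu\rho}
  &= +\frac{e^2}{4\pi^2}\,\partial_\mu\sigma\,\partial^\mu\sigma \\[4pt]
&\text{Sum:} &
\mathcal{L}_{\rm eff}
  &= \frac{e^2}{8\pi^2}\,\partial_\mu\sigma\,\partial^\mu\sigma
\end{aligned}
```

Because the integral over $F_{\nu\rho}$ is Gaussian, evaluating the action on the solution to its equation of motion is exact up to an overall constant.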


In this formulation, the global symmetry $U(1)_{\rm top}$ is manifest, and is given by 


$$ U(1)_{\rm top}:\quad \sigma \to \sigma + \alpha \tag{8.10} $$


This agrees with our expected symmetry transformation (8.6) given the identification 
(8.8). The associated current can be read off from (8.9); it is 


$$ J^\mu_{\rm top} = \frac{e^2}{4\pi^2}\,\partial^\mu\sigma \tag{8.11} $$


There’s one nice twist to this story. The theory (8.9) has a degeneracy of ground states, 
given by constant $\sigma \in [0, 2\pi)$. These degenerate ground states reflect the fact that if 
we place some magnetic flux in the Coulomb phase then it spreads out. In any of these 
ground states, the global symmetry $U(1)_{\rm top}$ acts like (8.10) and so is spontaneously 
broken. The associated Goldstone boson is simply $\sigma$ itself. But this is equivalent to 
the original photon. We have the chain of ideas 


$$ \text{Coulomb Phase:}\quad \text{Unbroken}\ U(1)_{\rm gauge} \ \Leftrightarrow\ \text{Spontaneously Broken}\ U(1)_{\rm top} \ \Leftrightarrow\ \text{Goldstone Mode} = \text{Photon} $$


A related set of ideas also holds in higher dimensions, but now with the $U(1)_{\rm top}$ a 
generalised symmetry, which acts on higher dimensional objects, as we discussed in 
Section 3.6.2. The case of d = 2 + 1 dimensions is special because the disorder operator $M(x)$ is 
a local operator, ensuring that $U(1)_{\rm top}$ is a standard global symmetry, rather than the 
less familiar generalised symmetry. 


8.2 The Abelian-Higgs Model 


We can get some more intuition for the role of monopole operators, and 3d gauge 
theories in general, by looking at the Abelian-Higgs model. This is a U(1) gauge 
theory coupled to a scalar field $\phi$ which we take to have charge 1. The action is 


$$ S_{\rm AH} = \int d^3x\ -\frac{1}{4e^2}F_{\mu\nu}F^{\mu\nu} + |D_\mu\phi|^2 - m^2|\phi|^2 - \frac{\lambda}{2}|\phi|^4 \tag{8.12} $$


We will look at what happens to this theory as we vary the mass $m^2$ from positive 
to negative. This is a game that we’ve already played in both d = 3 + 1 dimensions 
(in Section 2.5.2) and in d = 1 + 1 dimensions (in Section 7.2). In both cases, the 
interesting physics came from vortices in the $m^2 < 0$ phase, and the same will be true 
here. 


When the mass is small, $|m| \ll e^2$, the theory is strongly coupled in the infra-red. 
It is difficult to get a handle on the physics here, although we will ultimately be able 
to understand what happens. In contrast, when $|m| \gg e^2$, we can first understand the 
dynamics of the scalar in a regime where the gauge field is weakly coupled, and then 
figure out what’s left. We first look at these two phases. 


$m^2 \gg e^4$: When $m^2 > 0$ we can simply integrate out the scalar, to leave ourselves 
with free Maxwell theory below the scale of $m^2$. This is the gapless Coulomb phase, in 
which we have an unbroken U(1) gauge symmetry. As we explained above, this means 
that the global symmetry $U(1)_{\rm top}$ is spontaneously broken. The Goldstone mode is the 
photon. 


There are also massive, charged excitations in this phase that come from the $\phi$ field. 
They interact through the Coulomb force which means that charges of opposite sign 
experience a logarithmically confining potential (8.2). 


$m^2 \ll -e^4$: This is the Higgs phase. The scalar condenses, 


$$ |\phi|^2 = -\frac{m^2}{\lambda} $$


giving the photon a mass. This phase is gapped. The U(1) gauge symmetry is spontaneously 
broken. But now the global topological symmetry $U(1)_{\rm top}$ is unbroken. 


The finite energy states of the theory which carry non-vanishing $Q_{\rm top}$ charge are 
the vortices. We discussed these in detail in both d = 3 + 1 dimensions where the 
vortices are strings (see Section 2.5.2) and in d = 1 + 1 dimensions where the vortices 
are instantons (see Section 7.2). In d = 2 + 1, vortices are particle-like excitations. 
They are classical configurations in which the phase of $\phi$ winds asymptotically in the 
spatial plane $\mathbb{R}^2$. They have finite energy, and finite quantised magnetic flux 


$$ \frac{1}{2\pi}\int d^2x\ B = Q_{\rm top} \in \mathbb{Z} $$


This is what monopole operators do in the Higgs phase: they create vortices. The 
upshot is that we can characterise the Higgs phase of the theory as 


$$ \text{Higgs Phase:}\quad \text{Spontaneously Broken}\ U(1)_{\rm gauge} \ \Leftrightarrow\ \text{Unbroken}\ U(1)_{\rm top} \ \Leftrightarrow\ \text{Charged Excitation} = \text{Vortex} $$


$m^2 = 0$: In d = 2 + 1, the two phases at $m^2 > 0$ and $m^2 < 0$ are clearly different 
since they realise the global symmetry $U(1)_{\rm top}$ differently. (This is in contrast to the story 
in d = 1 + 1 where vortices are instantons and blur the distinction between the two 
phases.) 


We can ask: what happens as we dial $m^2$ from positive to negative? We expect a 
phase transition to occur at some point, which we heuristically refer to as $m^2 = 0$. 
(In practice, this point can be shifted away from zero.) Is this a first order phase 
transition, or second order? If second order, what universality class does the theory 
lie in? Because the theory is strongly coupled in the regime $|m| \ll e^2$ it is difficult to 
perform any quantitative calculations to answer this question. Instead, we will guess. 


To guide our guess, we use the symmetries of the problem. Since we have identified 
the global $U(1)_{\rm top}$ symmetry as distinguishing phases, it seems reasonable to postulate 
that the phase transition lies in the same universality class as other theories governed 
by a U(1) global symmetry. This turns out to be true, and underlies a rather beautiful 
feature of 3d gauge theories known as particle-vortex duality. 


8.2.1 Particle-Vortex Duality 


In quantum field theories, there are very often two kinds of particle excitations that can 
appear. The first kind is the familiar excitation that we get when we quantise a local 
field. This is the kind that we learned about in our Quantum Field Theory course. 
The second kind we’ve seen a number of times in these lectures: they are solitons. 


Despite the fact that these two kinds of particles arise in different ways, there is 
really little difference between them in the quantum theory. In particular, both are 
described as states in the Fock space. Typically at weak coupling, the solitons are 
much heavier than the “elementary particles”, but that’s more a limitation of our need 
to work at weak coupling. It may be — and often is — that as we move into strongly 
coupled regimes, the solitons become light. 


This opens up an intriguing possibility. Is it possible to write down a different 
quantum field theory in which the roles of solitons and elementary particles are reversed? 
These two quantum field theories would describe the same physics, but what appears 
as a soliton in one would appear as an elementary particle in the other, and vice versa. 
This is referred to as a duality. 


In fact, we’ve already met a simple example of a duality in these lectures. In Section 
7.5, we used bosonization to demonstrate the equivalence between a massive fermion 
and the Sine-Gordon model. The elementary fermion arises as a kink in the Sine-Gordon 
model. 


Typically, dualities get harder to construct with any conviction as the number of 
dimensions increases. There are wonderful examples of dualities in d = 3 + 1, which 
exchange electric and magnetic excitations, but they need supersymmetry to keep control 
over the dynamics and so are beyond the scope of these lectures. However, things are 
somewhat easier in d = 2+ 1. Here we do have examples of dualities. In contrast to 
the bosonization story of Section 7.5, we are unable to prove the d = 2+ 1 dualities 
from first principles, but nonetheless have convincing evidence that they are true. We 
will see a number of these dualities as we proceed. 


As we’ve seen above, in d = 2 + 1 dimensions the appropriate solitons are vortices. 
We will now propose a second theory, whose classical dynamics is different from the 
Abelian-Higgs model (8.12), but whose quantum dynamics is argued to be identical. 
The vortices in one theory are identified with the elementary particles of the other. For 
this reason, the claimed equivalence of the two theories is referred to as particle-vortex 
duality. 


The XY-Model 


The theory which is claimed to be dual to the 3d Abelian-Higgs model is simply a 
theory of a complex scalar field $\phi$, without any gauge field, 


$$ S_{\rm XY} = \int d^3x\ |\partial_\mu\phi|^2 - m^2|\phi|^2 - \frac{\lambda}{2}|\phi|^4 \tag{8.13} $$


This is known as the XY-model. At first glance, the physics of this model is rather 
different from the Abelian-Higgs model. Indeed, it appears to have fewer degrees 
of freedom because it is missing the gauge field. Nonetheless, as we now explain, they 
describe the same physics, albeit in a non-obvious and interesting way. 


Let’s first address the issue of degrees of freedom. The XY-model clearly has two 
degrees of freedom in the UV where it is weakly coupled. But the Abelian-Higgs model 
has the same number: the gauge redundancy removes one degree of freedom from $\phi$, 
but this is replenished by the single polarization state of the photon. We learn an 
interesting lesson: gauging a U(1) symmetry in d = 2 + 1 changes the dynamics, but 
does not change the overall number of degrees of freedom. This will be important in 
later developments. 


We can also match the symmetries between the XY-model and the Abelian-Higgs 
model. The XY-model clearly has a U(1) global symmetry which rotates the phase of 
$\phi$. The associated current is 


$$ J^\mu_{\rm XY} = i\left(\phi^\dagger\partial^\mu\phi - (\partial^\mu\phi^\dagger)\,\phi\right) $$


The Abelian-Higgs model also has a single global symmetry that we called U(1)top. You 
might worry that the Abelian-Higgs model also has a gauge symmetry, which is clearly 
not shared by the XY-model. But, as we have stressed many times, gauge symmetries 
are not symmetries at all, but redundancies. This gives another important lesson: there 
is no need for gauge symmetries to match on both sides of a duality. 


We can now look at how the physics of the XY-model changes as we vary the mass: 


$m^2 > 0$: This is a gapped phase. The $\phi$ excitations are massive and carry charge 
under the unbroken U(1) global symmetry. We see that, at least with broad brush, this 
looks similar to the Higgs phase of the Abelian-Higgs model, in which the $U(1)_{\rm top}$ 
symmetry was unbroken. In that case, the vortices carried charge under $U(1)_{\rm top}$. 


$m^2 < 0$: In this phase, $\phi$ gets a vacuum expectation value and the U(1) global 
symmetry is broken. We can write $\phi = \rho e^{i\sigma}$. The fluctuations of $\rho$ are massive, while 
the $\sigma$ field is massless: it is the Goldstone mode for the broken U(1). Notice that we’ve 
given this field the same name as the dual photon in the Abelian-Higgs model. This is 
not a coincidence. 


Again, with broad brush this looks similar to the gapless Coulomb phase of the 
Abelian-Higgs model. However, the Coulomb phase was also characterised by the 
existence of massive, charged $\phi$ excitations that were logarithmically confined. Can we 
see similar excitations in the XY-model? The answer is yes. 


The ordered phase of the XY-model also has vortices. As before, these arise from 
the phase of $\phi$ winding asymptotically, but now there is no gauge field to cancel the 
log divergence in their energy, 


$$ \text{Energy} = \int d^2x\ |\partial_i\phi|^2 = \int d\theta\,dr\ r\,\frac{1}{r^2}|\partial_\theta\phi|^2 + \ldots = 2\pi\int^\infty dr\ \frac{|\phi|^2}{r} + \ldots $$

The energy of a single vortex is logarithmically divergent. But this divergence can 
be cancelled by placing an anti-vortex at some distance $r$. It’s not hard to convince 
yourself that the logarithm reappears in the potential energy between the vortex and 
anti-vortex, which scales as 


$$ V \sim \frac{1}{2\pi}\log\left(\frac{r}{r_0}\right) $$


for some cut-off $r_0$. In other words, the vortices are logarithmically confined. This, of 
course, is the same behaviour exhibited by charged particles in 3d electromagnetism. 
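

The logarithmic growth of the vortex energy is easy to see numerically. The following Python sketch is illustrative only: it assumes the asymptotic form of the gradient energy, $2\pi\int dr\,|\phi|^2/r$, with $|\phi|$ frozen at a constant value `vev`, and it introduces a short-distance cut-off `r_min` standing in for the vortex core. The function name and parameter values are ours.

```python
import math


def gradient_energy(r_max, r_min=1.0, vev=1.0, steps=200_000):
    """Midpoint-rule estimate of 2 pi * integral of vev^2 / r dr from r_min to r_max."""
    dr = (r_max - r_min) / steps
    total = sum(vev ** 2 / (r_min + (i + 0.5) * dr) for i in range(steps))
    return 2 * math.pi * total * dr


for r_max in (10.0, 100.0, 1000.0):
    # Compare the numerical integral with the closed form 2 pi vev^2 log(r_max / r_min).
    print(r_max, gradient_energy(r_max), 2 * math.pi * math.log(r_max))
```

The energy grows without bound as the system size `r_max` increases, but only logarithmically. In a vortex and anti-vortex pair the winding cancels at distances beyond their separation, so the separation plays the role of `r_max`.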


$m^2 = 0$: Lying between the two phases above is a critical point. Once again, we are 
being a little careless in describing this as sitting at $m = 0$; strictly, you should tune 
both $m$ and the other parameters to hit the critical point. 


This time, the physics of the critical point is well understood: this is the XY 
Wilson-Fisher fixed point. We studied this in some detail in the lectures on Statistical Field 
Theory using the epsilon expansion. 


The essence of particle-vortex duality is the claim that the Abelian-Higgs model also 
flows to the XY Wilson-Fisher fixed point at $m = 0$. This claim can be traced back to 
work of Peskin in the 1970s, but was brought to prominence by Dasgupta and Halperin 
in the early 1980s. Given the similarity in their phase structure, this would seem to 
be a reasonable claim. There is currently no proof of the duality, but there is now 
convincing numerical evidence that it is true. 


The Duality Dictionary 


The key to particle-vortex duality is really the idea of universality: the two theories 
(8.12) and (8.13) share the same critical point. We can then attempt to map the oper- 
ators of the two theories at the critical point. We have only an incomplete dictionary 
at the moment, but our discussion above allows us to start to fill in some entries. For 
example, we have seen how the currents match on both sides 


$$ J^\mu_{\rm top} \ \longleftrightarrow\ J^\mu_{\rm XY} $$


With two theories flowing to the same critical point, we can now turn on relevant 
operators in each. As long as we turn on the same relevant operator, we are guaranteed 
that the theories coincide in the neighbourhood of the fixed points. We have seen above 
how this plays out: when the scalar condenses in one theory, it matches the phase in 
which the scalar is not condensed in the other. Roughly speaking, we have 


$$ m^2_{\rm AH} \ \longleftrightarrow\ -m^2_{\rm XY} $$


Alternatively, we can write this in terms of the relevant operators at the critical point 
as 


$$ |\phi|^2_{\rm AH} \ \longleftrightarrow\ -|\phi|^2_{\rm XY} \tag{8.14} $$


although since the critical points are strongly coupled, this relation is likely to have 
corrections, with operators on both sides mixing with others. 


Far from the critical point, we have seen that the theories have the same qualitative 
features. In particular, the duality inherits its name from the map between massive 
excitations, 


$$ \text{gauge vortex} \ \longleftrightarrow\ \phi\ \text{excitation} $$


$$ \phi\ \text{excitation} \ \longleftrightarrow\ \text{global vortex} $$


Only the first of these describes a map between finite energy excitations. In this case, it
is better to phrase the map in terms of local operators, rather than solitons: the essence
of particle-vortex duality is that the monopole operator on one side is a traditional field
in the Lagrangian on the other,


\[ M(x) \;\longleftrightarrow\; \tilde\phi(x) \qquad (8.15) \]


We could ask: do the interactions between these massive excitations agree in detail? 
The answer is most likely no. One could add irrelevant operators to both the Abelian- 
Higgs model and the XY model which will affect the interactions between these massive 
particles. We would have to work much harder to get quantitative agreement away 
from the critical point. For what it’s worth, it is possible to do this matching in certain 
supersymmetric versions of the duality. Here, particle-vortex duality is referred to as 
3d mirror symmetry. 


The View from Statistical Physics 


The claim of particle-vortex duality offers a very clear experimental prediction. Although
we have phrased our discussion in the context of physics in d = 2+1 dimensions,
everything goes through in the Euclidean d = 3+0 world. Here, the theories (8.12)
and (8.13) can be viewed as statistical field theories, with the path integral describing 
thermal rather than quantum fluctuations. More details can be found in the lecture 
notes on Statistical Field Theory. 


In this context, the 3d XY-model (8.13) governs the phase transition of a number of 
systems, including the superfluid transition of liquid helium. Similarly, the 3d Abelian- 
Higgs model (8.12) governs the superconducting phase transition, with the field strength 
$F_{ij}$, with $i, j = 1,2,3$, describing the fluctuating magnetic field.


In both cases, the mass$^2$ term determines the deviation from the critical temperature
$T_c$ at which the phase transition occurs. But that makes the map (8.14) between the
masses rather surprising. It means that the duality maps the high temperature phase 
of the superfluid to the low temperature phase of the superconductor, and vice versa. 


The claim that both theories share a critical point then becomes the claim that the 
two phase transitions have the same critical exponents. Experimentally, however, this 
claim is incorrect: the two phase transitions are not the same. While the superfluid 
transition exhibits the XY Wilson-Fisher exponents, the superconducting transition 
has mean field exponents. It would seem that particle-vortex duality has been ruled 
out experimentally!


In fact this is too quick. Recall that the XY-model has two critical points. The mean
field critical point is unstable, with $|\phi|^4$ a relevant operator that drives the theory to
the Wilson-Fisher point. The same should be true of the Abelian-Higgs model. It is
thought that the mean field exponents seen in the superconducting transition reflect
the fact that the experiments haven’t got close enough to the true critical point, and
are instead probing the unstable mean field point. Calculations suggest that one would
start to see Wilson-Fisher critical exponents in the superconducting transition only at
$T - T_c \sim 10^{-9}$ K. Such a level of precision is not technologically feasible.


But this brings its own issues. It appears that we have a system in Nature which
is fine-tuned. The natural scale of the superconducting phase transition is $T_c \sim 10$ K
or so. In the experiments, we tune the coefficient of $|\phi|^2$ by hand to hit the critical
temperature. But why is the coefficient of the $|\phi|^4$ relevant operator so small that it
only shows up when $T - T_c \sim 10^{-9}$ K? This is similar to the famous hierarchy problem
in the Standard Model, where again the coefficient of a relevant operator appears to 
be fine-tuned. 


Particle physicists have sleepless nights over fine-tuning, and desperately search for
an explanation. In large part, this is because of experience with RG in statistical
physics, where any fine-tuning seen in Nature must also have an explanation. In the
case of superconductors, the apparent fine-tuning is understood: it arises because the
underlying scalar field $\phi$ is not fundamental, but is instead a composite: a Cooper pair
of electrons. (The analogous possibility for the Higgs fine-tuning goes by the name
of technicolour.) A full explanation would take us too far from the purpose of these
lectures, but this suffices to ensure that the smallness of the $|\phi|^4$ relevant operator seen
in the superconducting transition is technically natural.


8.3 Confinement in d = 2+ 1 Electromagnetism 


We’ve seen that classical electromagnetism in d = 2+ 1 dimensions confines particles, 
but only weakly with a log potential.


There is, however, an important effect in the quantum theory that turns the logarithmic 
confining potential into a more powerful linearly confining potential. This effect, first 
discovered by Polyakov, is due to instantons. 


We’ve met instantons in d = 3+ 1 Yang-Mills theory in Section 2.3, and again in 
the d = 1+1 Abelian-Higgs model in Section 7.2. In the latter case, it is vortices that play
the role of instantons. Now that we are living in d = 2+1 dimensions, the instantons
should be objects localised in three Euclidean dimensions. But these are very familiar: 
they are magnetic monopoles. 


We've already introduced the idea of monopole operators in Section 8.1. These can 
be thought of as Dirac monopoles at a point. They are not quite what we want for the
present purposes. As a starting point for a semi-classical calculation, we would like the 
monopoles to be smooth configurations with finite action. But we’ve seen such objects 
before: we can use the ’t Hooft Polyakov monopole described in Section 2.8. 


Recall that the ’t Hooft Polyakov monopoles arise in an SU(2) gauge theory (or, 
more generally, any non-Abelian gauge theory) broken down to its Cartan subalgebra. 
To achieve this, we couple the SU(2) gauge theory to a real, adjoint scalar ¢ and work 
with the action 


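\[ S = \int d^3x \; -\frac{1}{2g^2}\,{\rm tr}\, F_{\mu\nu}F^{\mu\nu} + \frac{1}{g^2}\,{\rm tr}\,({\cal D}_\mu\phi)^2 - \frac{\lambda}{4}\left({\rm tr}\,\phi^2 - v^2\right)^2 \qquad (8.16) \]


The ground state of the system has, up to a gauge transformation, $\phi = v\sigma^3$, and breaks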
the gauge symmetry 


\[ SU(2) \rightarrow U(1) \]


At low energies, the spectrum contains just a single massless photon and looks like
pure electromagnetism. In addition, there is a neutral scalar with mass $\sim \sqrt{\lambda}\, g v$ and a
charged W-boson of mass $\sim v$.


In this way, we can view the model as U(1) gauge theory, with a UV cut-off at the
scale $v$. The dimensionless gauge coupling constant is $g^2/v$ and to trust any semi-
classical calculation, we must take $g^2/v \ll 1$.


8.3.1 Monopoles as Instantons 


Our main reason for introducing the action (8.16) is that, in Euclidean spacetime, it 
admits smooth monopole solutions. These are the ’t Hooft Polyakov monopoles that 
we introduced in Section 2.8, but now localised in Euclidean spacetime meaning that 
they play the role of instantons, rather than particles. Here we recount the basics. 


The existence of the monopoles can be traced to topology. Any finite action configuration
must obey ${\rm tr}\,\phi^2 \rightarrow v^2$ as $x \rightarrow \infty$. This defines a sphere $S^2$ in field space, so all
finite action configurations are classified by a winding number $\Pi_2(S^2) = {\bf Z}$, defined as


\[ \nu = \frac{1}{8\pi}\int_{S^2_\infty} d^2S_i\; \epsilon^{ijk}\epsilon_{abc}\, \hat\phi^a\,\partial_j\hat\phi^b\,\partial_k\hat\phi^c \in {\bf Z} \qquad (8.17) \]


with $\hat\phi^a$ the unit vector pointing along $\phi = \phi^a\sigma^a$.


However, winding comes at a cost. Any purely scalar configuration that winds has
linearly divergent action. This can be compensated by turning on a gauge field and
this, in turn, endows the soliton with magnetic charge in the unbroken $U(1) \subset SU(2)$,
(2.91),


\[ m = \frac{1}{v}\int d^2S_i\; \frac{1}{2}\epsilon^{ijk}\,{\rm tr}\left(\phi F_{jk}\right) = 4\pi\nu \]

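The solution for a single monopole, with winding $\nu = 1$, has asymptotic form


\[ \phi^a \rightarrow \frac{v x^a}{r} \quad {\rm and} \quad A_i^a \rightarrow \epsilon^a{}_{ij}\frac{x^j}{r^2} \quad {\rm as}\ r \rightarrow \infty \]
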
The action of this configuration is finite, and given by


\[ S_{\rm mono} = \frac{8\pi v}{g^2}\, f(\lambda g^2) \]


with $f(\lambda g^2)$ a monotonically increasing function. It has the property that $f(0) = 1$, so
that the action above coincides with that of a BPS monopole (2.93) when $\lambda = 0$.


We’re used to the idea that finite action configurations in Euclidean space tunnel 
between different vacua of the theory. But what vacua does the monopole tunnel 
between? Clearly, it changes the magnetic flux $\Phi = \int d^2x\, B$ on a spatial slice. If we
were living on a compact space, this would change the energy of a state, which is given
by


\[ E = \int d^2x\; \frac{1}{2g^2} B^2 = \frac{1}{2g^2}\,{\rm Area}\left(\frac{\Phi}{\rm Area}\right)^2 \]


with “Area” the area of a spatial slice. However, as the area tends to infinity, the flux 
is suitably diluted and the cost in energy is vanishingly small. These are the different 
vacua that the monopoles tunnel between. 


A Dilute Gas of Monopoles and Anti-Monopoles 


With our monopole solution in hand, we can use it as the starting point for a semi- 
classical evaluation of the path integral. We should be getting used to this by now, and 
we follow the structure of the calculation laid out in Section 2.3, and again in Section
7.2.


One key step in the calculation is to invoke the use of a dilute gas of instantons.
In the present case, this means we treat configurations of widely separated monopoles
and anti-monopoles, with magnetic charges $m_i = \pm 4\pi$, as saddle points in the path
integral. In the previous situations, we argued that the action of a dilute gas of $N$
(anti)-instantons was roughly $S \approx N S_{\rm inst}$, reflecting the fact that these are approximate
solutions when the objects are far separated.


For monopoles, however, we should treat this step more carefully. Viewed as particles
in d = 3+1 dimensions, we know that the energy will pick up contributions from the
long range Coulomb forces between the monopoles. This translates into a contribution
to the action in our context. If a monopole of charge $m_i = \pm 4\pi$ sits at position $x_i$, the
total action will be


\[ S = N S_{\rm mono} + \frac{1}{8\pi g^2}\sum_{i\neq j}\frac{m_i m_j}{|x_i - x_j|} \]


where the second term reflects the long range Coulomb interaction. 


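We evaluate the path integral by summing over these dilute gas configurations,
containing $N$ constituents of either type. This results in the expression,


\[ Z = \sum_{N=0}^\infty\ \sum_{m_i = \pm 4\pi} \frac{1}{N!}\left(K e^{-S_{\rm mono}}\right)^N \int \prod_{i=1}^N d^3x_i\; \exp\left(-\frac{1}{8\pi g^2}\sum_{i\neq j}\frac{m_i m_j}{|x_i - x_j|}\right) \qquad (8.18) \]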


Here K is the usual contribution from one-loop determinants and Jacobian factors. We 
could compute it, but it does not give any qualitatively new insights into the physics 
so we will not. The second factor in the expression above is the novelty. When the 
instantons are non-interacting, this just gives a power of $V^N$ to the path integral,
with V the spacetime volume. Now that we have long range interactions between the 
instantons, we must work a little harder. 


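There is a useful way to rewrite the final expression. We use the fact that the $1/r$
factor also arises in the Green’s function of the Laplacian in three dimensions. In
general, for a scalar field $\sigma(x)$, and any fixed function $f(x)$, we have


\[ \int {\cal D}\sigma\; \exp\left(-\int d^3x \left[\frac{1}{2}(\partial_\mu\sigma)^2 + f(x)\sigma(x)\right]\right) \sim \exp\left(\frac{1}{8\pi}\int d^3x\, d^3y\; \frac{f(x)f(y)}{|x-y|}\right) \]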


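Using this, we rewrite the sum over the Coulomb gas in (8.18) as a path integral


\[ \exp\left(-\frac{1}{8\pi g^2}\sum_{i\neq j}\frac{m_i m_j}{|x_i-x_j|}\right) = \int {\cal D}\sigma\; \exp\left(-\int d^3x \left[\frac{g^2}{8\pi^2}(\partial_\mu\sigma)^2 - \frac{i}{2\pi}\sum_{i=1}^N m_i\,\sigma(x)\,\delta^3(x-x_i)\right]\right) \]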


(We used a very similar trick in the lectures on Statistical Field Theory when treating 
the 2d Coulomb gas in the XY model.) 
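

The Gaussian integral behind this rewriting can be checked by completing the square. What follows is only a sketch of that step. The Green's function $G$ and the rescaled field $\chi$ are notation introduced here for illustration, the dual photon kinetic term is assumed to be normalised as $(g^2/8\pi^2)(\partial_\mu\sigma)^2$, and the divergent $i=j$ self-energy terms are assumed to be absorbed into the constant $K$.

```latex
% Notation for this sketch: G(x) = 1/(4\pi |x|) obeys -\partial^2 G(x) = \delta^3(x).
% The action S[\sigma] = \int d^3x [ (1/2)(\partial_\mu\sigma)^2 + f\sigma ] is extremised at \sigma_cl:
\begin{align*}
\sigma_{\rm cl}(x) &= -\int d^3y\; G(x-y)\, f(y) , \\
-S[\sigma_{\rm cl}] &= \frac{1}{2}\int d^3x\, d^3y\; f(x)\, G(x-y)\, f(y)
 = \frac{1}{8\pi}\int d^3x\, d^3y\; \frac{f(x)\, f(y)}{|x-y|} .
\end{align*}
% Coulomb gas: rescale \sigma = (2\pi/g)\chi, so that (g^2/8\pi^2)(\partial\sigma)^2 = (1/2)(\partial\chi)^2,
% and take the imaginary source f(x) = -(i/g) \sum_i m_i \delta^3(x - x_i), i.e. a factor
% \exp( i \sum_i m_i \sigma(x_i) / 2\pi ) in the integrand. Then
\begin{align*}
\frac{1}{8\pi}\int d^3x\, d^3y\; \frac{f(x)\, f(y)}{|x-y|}
 = -\frac{1}{8\pi g^2}\sum_{i\neq j}\frac{m_i m_j}{|x_i-x_j|} + (\mbox{self-energy terms}) .
\end{align*}
```

The factor of $i$ in the source is what turns the attractive sign of the Gaussian formula into the repulsion between like charges.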


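In fact, we’ve met this field $\sigma(x)$ before: it is precisely the dual photon that we
introduced in Section 8.1. To see this, note that the coupling to the magnetic charge
above coincides with the coupling in (8.7).


Continuing with our calculation, the partition function becomes


\[ \begin{aligned}
Z &= \int {\cal D}\sigma\; \exp\left(-\int d^3x\; \frac{g^2}{8\pi^2}(\partial_\mu\sigma)^2\right) \sum_{N=0}^\infty \frac{(K e^{-S_{\rm mono}})^N}{N!} \int \prod_{i=1}^N d^3x_i \sum_{m_i=\pm 4\pi} \exp\left(\frac{i}{2\pi}\sum_{i=1}^N m_i\,\sigma(x_i)\right) \\
&= \int {\cal D}\sigma\; \exp\left(-\int d^3x\; \frac{g^2}{8\pi^2}(\partial_\mu\sigma)^2\right) \sum_{N=0}^\infty \frac{(K e^{-S_{\rm mono}})^N}{N!} \left(\int d^3x\; 2\cos(2\sigma)\right)^N \\
&= \int {\cal D}\sigma\; \exp\left(-\int d^3x \left[\frac{g^2}{8\pi^2}(\partial_\mu\sigma)^2 - 2Ke^{-S_{\rm mono}}\cos(2\sigma)\right]\right) \qquad (8.19)
\end{aligned} \]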


We can now see the net effect of the instantons: they have generated a potential for
the dual photon $\sigma$. Expanding about the minimum at $\sigma = 0$, we find that the dual
photon has acquired a mass,


\[ m^2_{\rm photon} = \frac{32\pi^2 K e^{-S_{\rm mono}}}{g^2} \]


On dimensional grounds, the one-loop determinants and Jacobian factors that we
lumped into the constant $K$ must have dimension $[K] = 3$. For small $\lambda$, it turns
out to scale as $K \sim v^{7/2}/g$. At weak coupling $g^2/v \ll 1$ and $S_{\rm mono} \gg 1$, where our
semi-classical analysis is valid, we find that the mass of the dual photon is exponentially
smaller than all other scales in the game. This means that we can read off the effective
action from (8.19)


\[ S_{\rm eff} = \int d^3x \left[\frac{g^2}{8\pi^2}(\partial_\mu\sigma)^2 - 2Ke^{-S_{\rm mono}}\cos(2\sigma)\right] \qquad (8.20) \]
T 


We recognise this as the Sine-Gordon model that we met in d = 1 + 1 dimensions in 
Section 7.5.5. Now it arises as the effective, low-energy description of a gauge theory 
in d = 2 + 1 dimensions. 


8.3.2 Confinement 


What does it mean for the dual photon to get a mass? To answer this, we can see how 
the ground state responds to various provocations. 


First, let’s try to turn on an electric field in the ground state, say $F_{01} \neq 0$. To
understand what this means in terms of the dual photon, we need to relate $F_{\mu\nu}$ with
$\sigma$. We can do this by comparing our expressions for the topological current (8.3) and
(8.11),


\[ J^\mu_{\rm top} = \frac{1}{4\pi}\epsilon^{\mu\nu\rho}F_{\nu\rho} = \frac{e^2}{4\pi^2}\partial^\mu\sigma \]


We find that an electric field corresponds to


\[ F_{01} = \frac{e^2}{2\pi}\partial_2\sigma \]


However, the configuration $\partial\sigma$ = constant does not obey the equations of motion of
our effective action (8.20). This means that the vacuum does not support a constant,
background electric field. Instead, solutions to the equations of motion with $\partial\sigma \neq 0$
are kinks, or domain walls, in which $\sigma$ interpolates from, say, $\sigma = 0$ as $x^2 \rightarrow -\infty$, to
$\sigma = 2\pi$ as $x^2 \rightarrow +\infty$. We already met these kinks in Section 7.5.5 when discussing the
Sine-Gordon model in d = 1+1 dimensions. In the present context, the domain walls
are string-like configurations stretched in the $x^1$ direction, with width $\sim 1/m_{\rm photon}$ in
the $x^2$ direction, and tension,


\[ \gamma_{\rm string} = \frac{4}{\pi}\sqrt{2 K g^2 e^{-S_{\rm mono}}} \]
T 


a result which follows from translating our earlier result (7.69). (Up until now, we’ve 
always referred to the string tension as ø. Obviously that’s a bad choice for our current 
discussion.) 
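

The mass and tension can be checked numerically. The sketch below assumes the effective Lagrangian has the form $a(\partial\sigma)^2 + b(1-\cos 2\sigma)$ with $a = g^2/8\pi^2$ and $b = 2Ke^{-S_{\rm mono}}$, which is (8.20) up to an additive constant. The function names and the numerical values of `g2` and `K_eff` are hypothetical, chosen only for illustration. The wall tension is evaluated from the Bogomolnyi form $\int 2\sqrt{aV}\,d\sigma$ for a wall interpolating from $\sigma = 0$ to $\sigma = 2\pi$.

```python
import math

def wall_tension(a, b, sigma_end, steps=20000):
    """Tension of a static wall for L = a (dσ/dx)^2 + b (1 - cos 2σ).

    Uses the Bogomolnyi form T = ∫ 2 sqrt(a V(σ)) dσ, evaluated with the
    midpoint rule from σ = 0 to σ = sigma_end.
    """
    h = sigma_end / steps
    total = 0.0
    for n in range(steps):
        s = (n + 0.5) * h
        V = max(b * (1.0 - math.cos(2.0 * s)), 0.0)
        total += 2.0 * math.sqrt(a * V) * h
    return total

def photon_mass_squared(a, b):
    """Small-σ expansion: b (1 - cos 2σ) ≈ 2 b σ^2, so m^2 = 2 b / a."""
    return 2.0 * b / a

# Illustrative (hypothetical) numbers, not taken from the lectures.
g2 = 0.7          # gauge coupling g^2
K_eff = 1.3e-3    # stands for the combination K e^{-S_mono}
a = g2 / (8 * math.pi ** 2)
b = 2 * K_eff

tension_numeric = wall_tension(a, b, 2 * math.pi)
tension_closed = (4 / math.pi) * math.sqrt(2 * K_eff * g2)
mass_sq = photon_mass_squared(a, b)
mass_sq_closed = 32 * math.pi ** 2 * K_eff / g2
print(tension_numeric, tension_closed)
print(mass_sq, mass_sq_closed)
```

Under these assumptions a wall that stops at the neighbouring minimum $\sigma = \pi$ carries half of this tension, which `wall_tension(a, b, math.pi)` reproduces.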


The domain wall, or string, is a collimated flux tube of electric field $F_{01} \neq 0$. This is
the expected behaviour of a gauge theory that is linearly confining. In other words, the
classical log potential (8.2) of 3d gauge theories has been replaced with a more severe,


\[ V(r) \sim r \]


We could explicitly compute the Wilson loop in this framework and confirm that it 
does indeed exhibit an area law. 


We have seen that 3d electromagnetism exhibits linear confinement due to instantons
which, in this context, are monopoles. It is crucial that these monopoles have a finite
action, which we achieved by embedding the theory in a non-Abelian gauge group. If
we introduce other UV completions of the theory, with a finite cut-off $\Lambda_{UV}$, these too
will have monopoles, typically with action $S_{\rm mono} \sim \Lambda_{UV}/g^2$. (Lattice gauge theory
provides a good example of this.) These too will then exhibit linear confinement.


8.4 Chern-Simons Theory


Gauge theories in d = 2+ 1 dimensions admit a rather special interaction that does 
not have a counterpart in even spacetime dimensions. This is the famous Chern- 
Simons interaction. It plays a key role in many areas of theoretical and mathematical 
physics, from the physics of the quantum Hall effect, to the mathematics of the knot 
invariants. Many details on the former application can be found in the lecture notes 
on the Quantum Hall Effect. 


For U(1) gauge theory, the Chern-Simons term takes the form


\[ S_{CS} = \frac{k}{4\pi}\int d^3x\; \epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho \qquad (8.21) \]


We could consider this term on its own, or in conjunction with the Maxwell action 
(8.1). In either case, the dimensionless coefficient k is known as the level. We can 
write down similar terms in any odd spacetime dimension; we briefly met the d = 4+1 
dimensional version in Section 4.4.1. 


Let’s start by studying the symmetries of the Chern-Simons action. It is Lorentz 
invariant, courtesy of the $\epsilon^{\mu\nu\rho}$ invariant tensor. At an operational level, the existence
of this tensor means that the term is exclusive to d = 2+1 dimensions. However,
this same $\epsilon^{\mu\nu\rho}$ tensor means that the Chern-Simons interaction breaks both parity and
time-reversal invariance. Here we focus on parity. In even dimensions we can always
take parity to act as $x \mapsto -x$ (see, for example, (1.25)). But, in odd dimensions, this
coincides with a rotation. We should instead take parity to flip the sign of just a single
spatial coordinate,


\[ x^0 \rightarrow x^0 , \quad x^1 \rightarrow -x^1 , \quad x^2 \rightarrow x^2 \qquad (8.22) \]


and, correspondingly, $A_0 \rightarrow A_0$, $A_1 \rightarrow -A_1$ and $A_2 \rightarrow A_2$. This means that, as
advertised, the Chern-Simons action is odd under parity. 
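

The parity counting can be made explicit term by term. The sketch below assumes parity flips the single coordinate $x^1$, so that every lower index equal to 1 (on $A_\mu$ or on $\partial_\mu$) contributes a factor of $-1$. The function names are introduced here for illustration only. Each non-vanishing term of $\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho$ carries exactly one index equal to 1 and so is odd, while each term of $F_{\mu\nu}F^{\mu\nu}$ carries every index twice and so is even.

```python
from itertools import permutations

def index_sign(index, flipped=1):
    """A lower index equal to the flipped direction contributes a factor of -1."""
    return -1 if index == flipped else 1

def chern_simons_parity():
    """Parity sign of each non-vanishing term ε^{μνρ} A_μ ∂_ν A_ρ.

    The epsilon tensor is non-zero only when (μ, ν, ρ) is a permutation of (0, 1, 2).
    """
    return {p: index_sign(p[0]) * index_sign(p[1]) * index_sign(p[2])
            for p in permutations(range(3))}

def maxwell_parity():
    """Parity sign of each term F_{μν} F^{μν}: every index appears twice."""
    return {(m, n): (index_sign(m) * index_sign(n)) ** 2
            for m in range(3) for n in range(3) if m != n}

print(chern_simons_parity())
print(maxwell_parity())
```

The measure $d^3x$ is unchanged under the flip, so the sign of each term is the sign of the whole action.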


8.4.1 Quantisation of the Chern-Simons level 


At first glance, it’s not obvious that the Chern-Simons term is gauge invariant since it
depends explicitly on $A_\mu$. However, under a gauge transformation, $A_\mu \rightarrow A_\mu + \partial_\mu\omega$, we
have


\[ S_{CS} \rightarrow S_{CS} + \frac{k}{4\pi}\int d^3x\; \partial_\mu\left(\omega\,\epsilon^{\mu\nu\rho}\partial_\nu A_\rho\right) \]


The change is a total derivative. In many situations we can simply throw this total
derivative away and the Chern-Simons term is gauge invariant. However, there are
some situations where the total derivative does not vanish. As we will now show, in
these cases the Chern-Simons partition function is gauge invariant provided that


\[ k \in {\bf Z} \qquad (8.23) \]


For Abelian Chern-Simons theories, it’s a little subtle to see the requirement (8.23) 
since it only shows up in the presence of magnetic flux. (This is to be contrasted with 
the situation for non-Abelian Chern-Simons theories described in Section 8.4.3 where 
one can see the analogous quantisation condition around the vacuum state.) 


Perhaps the simplest way is to consider the theory on Euclidean spacetime $S^1 \times S^2$.
We then add a single unit of magnetic flux through the $S^2$. As we’ve seen many times
in these lectures, if we take the gauge group to be compact U(1), the flux is quantised, in
the minimal unit


\[ \frac{1}{2\pi}\int_{S^2} F_{12} = 1 \qquad (8.24) \]


We then consider large gauge transformations of this background that wind around
the $S^1$. We denote the radius of this $S^1$ as $R$, and parameterise it by the coordinate
$x^0 \in [0, 2\pi R)$. Consider a gauge transformation $A_\mu \rightarrow A_\mu + \partial_\mu\omega$ which winds around
the $S^1$, with


\[ \omega = \frac{x^0}{R} \qquad (8.25) \]


Under such a transformation, any matter field $\phi$ with charge $q \in {\bf Z}$ remains single
valued, since $\phi \rightarrow e^{iqx^0/R}\phi$. Even in the absence of charged matter, the statement that
we're working with a compact U(1) gauge group, rather than a non-compact ${\bf R}$ gauge
group, means that the theory admits fluxes (8.24) and gauge transformations (8.25).


Under the gauge transformation (8.25), we have


\[ A_0 \rightarrow A_0 + \frac{1}{R} \qquad (8.26) \]


This means that the zero mode of Ap is a periodic variable, with periodicity 1/R. 
(We came to the same conclusion in Section 7.1 where we discussed two dimensional 
electromagnetism on a spatial circle.) 


We can now see what becomes of our Chern-Simons action under such a gauge
transformation. Evaluated on a configuration with constant $A_0$, we have


\[ S_{CS} = \frac{k}{4\pi}\int d^3x\; A_0F_{12} + A_1F_{20} + A_2F_{01} \]


Now it’s tempting to throw away the last two terms when evaluating this on our
background. But we should be careful as it’s a topologically non-trivial configuration. We can
safely set all terms with $\partial_0$ to zero, but integrating by parts on the spatial derivatives
we get an extra factor of 2,


\[ S_{CS} = \frac{k}{2\pi}\int d^3x\; A_0F_{12} \qquad (8.27) \]


Evaluated on the flux (8.24), with constant $A_0 = a$, we have


\[ S_{CS} = 2\pi k R a \]


And under the gauge transformation (8.26), we have


\[ S_{CS} \rightarrow S_{CS} + 2\pi k \]


The Chern-Simons action is not gauge invariant. But all is not lost. The partition
function depends only on $e^{iS_{CS}}$ and this remains gauge invariant provided $k \in {\bf Z}$, which
is our claimed result. This last part of the argument is exactly the same as the one we 
met in Section 2.1.3 when we discussed Chern-Simons terms in quantum mechanics, 
and in a number of other places when we’ve discussed WZW terms. 
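

The quantisation argument is easy to check numerically. The sketch below assumes the action on the flux background is $S_{CS} = 2\pi k R a$ and that the large gauge transformation shifts $a \rightarrow a + 1/R$. The function names and the sample values of `R` and `a` are hypothetical. It compares the weight $e^{iS_{CS}}$ before and after the shift for several values of $k$.

```python
import cmath
import math

def cs_action_on_flux(k, R, a):
    """S_CS = 2π k R a for constant A_0 = a and one unit of flux through S^2."""
    return 2 * math.pi * k * R * a

def weight_is_invariant(k, R=1.7, a=0.3, tol=1e-9):
    """Compare e^{iS} before and after the large gauge shift a -> a + 1/R."""
    before = cmath.exp(1j * cs_action_on_flux(k, R, a))
    after = cmath.exp(1j * cs_action_on_flux(k, R, a + 1.0 / R))
    return abs(after - before) < tol

for k in (1, 2, -3, 0.5, 1 / 3):
    print(k, weight_is_invariant(k))
```

Only the integer values of $k$ leave the weight unchanged, independently of the choice of `R` and `a`.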


Chern-Simons Theories and Spinors 


There are further subtleties associated to the factor of 2 above, which we flag up here. 
A better way to think about the Chern-Simons theory on a 3-dimensional manifold $M$
is by viewing this as the boundary of a 4-dimensional manifold $X$. The story is simplest
in the language of forms, where we have


\[ S_{CS} = \frac{k}{4\pi}\int_{M=\partial X} A\wedge dA = \frac{k}{4\pi}\int_X F\wedge F \]


The fact that the Chern-Simons term is related to the 4-dimensional $\theta$ term was
anticipated in (1.12). Written in this way, the Chern-Simons term is clearly gauge invariant
since it depends only on $F$ and not $A$. Our worry, however, has transmuted to the
question of whether it depends on the choice of 4-manifold $X$. How can we be sure that
we get the same answer if we chose a different 4-manifold $X'$ which also has boundary
$\partial X' = M$? The difference between the two answers involves the integral over the
compact manifold $Y = X \cup X'$, formed by gluing together $X$ and $X'$ along their common
boundary,


\[ S_{CS}[A;X] - S_{CS}[A;X'] = \frac{k}{4\pi}\int_Y F\wedge F \]


We’re safe provided that this difference is $2\pi$ times an integer, since then the partition
function, which depends on $e^{iS_{CS}}$, is independent of the choice of $X$. Clearly this
requires


\[ \frac{1}{2}\int_Y \frac{F}{2\pi}\wedge\frac{F}{2\pi} \in {\bf Z} \qquad (8.28) \]


So is this true? Well, actually no. Or, at least, not always! It turns out that (8.28) is 
true only if the 4-manifold Y admits spinors or, more precisely, admits a mathematical 
object called a spin structure which tells you whether or not a fermion picks up a 
minus sign when it is transported around a loop. Any manifold that admits such a spin 
structure is called a spin manifold. And (8.28) holds whenever Y is a spin manifold. 


For example, $Y = T^4$, $Y = S^2 \times S^2$ and $Y = S^4$ are all spin manifolds. In these
cases (8.28) holds. To give you some sense of how this works, suppose that we take
$Y = S^2 \times S^2$. Dirac quantisation means that the flux through each of the spheres must
be a multiple of $2\pi$. If we take $F = F_1 + F_2$, with $F_n$ giving flux through the $n^{\rm th}$
2-sphere, then


\[ \frac{1}{2}\int_{S^2\times S^2}\frac{F}{2\pi}\wedge\frac{F}{2\pi} = \int_{S^2}\frac{F_1}{2\pi}\int_{S^2}\frac{F_2}{2\pi} \in {\bf Z} \]


with the factor of 2 coming from the cross-term. 


However, there are 4-manifolds Y which do not admit a spin structure. The simplest 
example is $Y = {\bf CP}^2$. In this case, $\int_Y (F/2\pi)\wedge(F/2\pi)$ is an integer, but not necessarily an even integer.


The upshot of this is that the Chern-Simons level $k$ for a U(1) gauge group can be
integer valued provided that the theory admits fermions. Otherwise, it must be an
even integer. The simple “integrate by parts to get an extra factor of 2” prescription
that we used to get (8.27) sweeps all of these subtleties under the rug. 
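

The difference between the spin and non-spin examples can be seen from the intersection form. The sketch below uses the standard intersection forms on $H^2(Y;{\bf Z})$, namely the off-diagonal $2\times 2$ form for $S^2 \times S^2$ and the $1\times 1$ form $(1)$ for ${\bf CP}^2$. The function names and the flux cutoff `max_flux` are introduced here for illustration. With integer fluxes $n$, the pairing $\int_Y (F/2\pi)\wedge(F/2\pi)$ is $n^T Q n$, and (8.28) holds precisely when this is always even.

```python
from itertools import product

# Intersection forms on H^2(Y; Z) in a standard basis.
Q_S2xS2 = [[0, 1], [1, 0]]
Q_CP2 = [[1]]

def flux_pairing(Q, n):
    """∫ (F/2π) ∧ (F/2π) = n^T Q n for an integer flux vector n."""
    size = len(Q)
    return sum(n[i] * Q[i][j] * n[j] for i in range(size) for j in range(size))

def pairing_always_even(Q, max_flux=3):
    """Check evenness of n^T Q n over all flux vectors with |n_i| <= max_flux."""
    fluxes = range(-max_flux, max_flux + 1)
    return all(flux_pairing(Q, n) % 2 == 0
               for n in product(fluxes, repeat=len(Q)))

print(pairing_always_even(Q_S2xS2))
print(pairing_always_even(Q_CP2))
```

For $S^2 \times S^2$ the pairing is $2 n_1 n_2$, which is the cross-term factor of 2 in the text. For ${\bf CP}^2$ it is $n^2$, which is odd whenever $n$ is odd.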


8.4.2 A Topological Phase of Matter 


So what is the physics of Chern-Simons theory? Despite the simplicity of the action, 
the physics is remarkably subtle. Let’s start with the basics. We’ll take the d = 2 + 1 
dimensional gauge field to be governed by


\[ S = S_{\rm Maxwell} + S_{CS} = \int d^3x\; -\frac{1}{4e^2}F_{\mu\nu}F^{\mu\nu} + \frac{k}{4\pi}\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho \]


We can start by gaining some intuition from the classical equation of motion,


\[ \partial_\mu F^{\mu\nu} + \frac{ke^2}{4\pi}\epsilon^{\nu\rho\sigma}F_{\rho\sigma} = 0 \qquad (8.29) \]


In terms of the electric field $E_i = F_{0i}$ and the magnetic field $B = F_{12}$, Gauss’ law
becomes


\[ \partial_iE_i = -\frac{ke^2}{2\pi}B \qquad (8.30) \]


which tells us that a magnetic field acts as a source for the electric field. This simple ob- 
servation will underlie much of the physics of Section 8.6 where we discuss bosonization 
in 3d. 


What are the propagating excitations of the equations of motion (8.29)? Taking one
further derivative of the equations of motion, we can decouple electric and magnetic
fields to show that each component obeys the massive wave equation,


\[ \partial^2E_i + \left(\frac{ke^2}{2\pi}\right)^2E_i = \partial^2B + \left(\frac{ke^2}{2\pi}\right)^2B = 0 \]


(To do this, it’s perhaps simplest to first define the field $G^\mu = \epsilon^{\mu\nu\rho}F_{\nu\rho}$ and show that
$G^\mu$ obeys the massive wave equation.) We see that, at least classically, the excitations
do not propagate at the speed of light. Instead, they are exponentially damped. In the
quantum theory, this means that we have a theory of massive excitations. The mass
of the photon is


\[ m_{CS} = \frac{ke^2}{2\pi} \]


Yet again, we find ourselves in a situation with a massive gauge boson. How should we
think of this phase?
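

The step from the equation of motion to the massive wave equation can be made explicit. The sketch below assumes that (8.29) takes the form $\partial_\mu F^{\mu\nu} + \frac{ke^2}{4\pi}\epsilon^{\nu\rho\sigma}F_{\rho\sigma} = 0$, works in signature $(+,-,-)$ with $\epsilon^{012} = +1$, and writes $m = ke^2/2\pi$ as shorthand introduced here.

```latex
\begin{align*}
G^\mu &= \epsilon^{\mu\nu\rho}F_{\nu\rho}
 \quad\Rightarrow\quad F_{\mu\nu} = \tfrac{1}{2}\epsilon_{\mu\nu\lambda}G^\lambda ,
 \qquad \partial_\mu G^\mu = 0 \ \ (\mbox{Bianchi identity}) , \\
0 &= \partial_\mu F^{\mu\nu} + \tfrac{m}{2}\, G^\nu
 \quad\Rightarrow\quad \epsilon^{\nu\mu\lambda}\partial_\mu G_\lambda = m\, G^\nu , \\
m^2 G^\rho &= m\,\epsilon^{\rho\sigma\nu}\partial_\sigma G_\nu
 = \epsilon^{\rho\sigma\nu}\epsilon_{\nu\mu\lambda}\,\partial_\sigma\partial^\mu G^\lambda
 = \partial^\rho(\partial_\lambda G^\lambda) - \partial^2 G^\rho
 = -\partial^2 G^\rho .
\end{align*}
% Hence (\partial^2 + m^2) G^\rho = 0, and the components of G^\rho are B and E_i.
```

The first-order equation in the second line is the useful intermediate result: the Chern-Simons term makes the dual field strength an eigenvector of the curl operator, and squaring that operator gives the Klein-Gordon equation.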


We’ve already met other situations in d = 2 + 1 dimensions where the photon gets a 
mass. There is the confining phase, driven by instantons, that we saw in Section 8.3, 
in which the Wilson loop has an area law. And there is, of course, the Higgs phase 
in which a charged scalar field condenses and the Wilson line has a perimeter law. It 
turns out that the Chern-Simons phase differs from both of these. Instead, it is a novel 
phase of matter, referred to as a topological phase. 


Topological phases of matter are subtle. They typically have interesting things going
on at energies $E \ll m_{CS}$ way below the gap, even though there are no physical
excitations beyond the vacuum. We’ll explain below what these interesting things are.


Chern-Simons Terms are Topological


Before we address the novel physics of Chern-Simons theory, we first point out an
important property of the Chern-Simons action (8.21): it doesn’t depend on the metric
of the background spacetime manifold. It depends only on the topology of the manifold.
To see this, let’s first look at the Maxwell action for comparison. If we were to couple
this to a background metric $g_{\mu\nu}$, the action becomes


\[ S_{\rm Maxwell} = \int d^3x\; \sqrt{-g}\; -\frac{1}{4e^2}\,g^{\mu\rho}g^{\nu\sigma}F_{\mu\nu}F_{\rho\sigma} \]


We see that the metric plays two roles: first, it is needed to raise the indices when
contracting $F_{\mu\nu}F^{\mu\nu}$; second it provides a measure $\sqrt{-g}$ (the volume form) which allows
us to integrate in a diffeomorphism invariant way. Recall from our first lectures on
Quantum Field Theory that this allows us to quickly construct the stress-tensor of the
theory by differentiating with respect to the metric,


\[ T^{\mu\nu} = \frac{2}{\sqrt{-g}}\frac{\partial{\cal L}}{\partial g_{\mu\nu}} \]


In contrast, we have no need to introduce a metric when generalising (8.21) to curved
spacetime. This is best stated in the language of differential geometry: $A \wedge dA$ is a
3-form, and we can quite happily integrate this over any three-dimensional manifold


\[ S_{CS} = \frac{k}{4\pi}\int A\wedge dA \]


This means that pure Chern-Simons theory knows nothing about length scales. In particular,
the Wilson loop can exhibit neither area nor perimeter law, since both of these are
statements about lengths. Moreover, pure Chern-Simons theory has vanishing stress
tensor.


Chern-Simons Theory on a Torus 


If Chern-Simons theory has vanishing stress tensor, and no physical excitations, then 
what can it possibly do? The answer is that the theory responds to low-energy probes 
in interesting ways. 


Here is a simple, yet dramatic way to probe the theory. We will place it on a spatial
2-dimensional manifold $\Sigma$. As we have seen, Chern-Simons theory knows nothing about
the metric on $\Sigma$. However, as we now show, it does know about the topology and
responds accordingly.


For pure Chern-Simons theory (or, equivalently, the $e^2 \rightarrow \infty$ limit of Maxwell-Chern-
Simons theory), Gauss’ law (8.30) becomes


\[ F_{12} = 0 \]


[Figure 54] [Figure 55]


Although this equation is very simple, it can still have interesting solutions if the 
background has some non-trivial topology. These are called, for obvious reason, flat 
connections. It’s simple to see that such solutions exist on the torus $\Sigma = T^2$, where
one example is to simply set each A; to be constant. Our first task is to find a gauge- 
invariant way to parameterise this space of solutions. 


We’ll denote the radii of the two circles of the torus $T^2 = S^1 \times S^1$ as $R_1$ and $R_2$.
We’ll denote two corresponding non-contractible curves shown in the figure as $\gamma_1$ and
$\gamma_2$. The simplest way to build a gauge invariant object from a gauge connection is to
integrate


\[ w_i = \oint_{\gamma_i} dx^j A_j \]


This is invariant under most gauge transformations, but not those that wind around
the circle. By the same kind of arguments that led us to (8.26), we can always construct
gauge transformations which shift $A_i \rightarrow A_i + 1/R_i$, and hence $w_i \rightarrow w_i + 2\pi$. The
correct gauge invariant objects to parameterise the solutions are therefore the Wilson
loops


\[ W_i = \exp\left(i\oint_{\gamma_i} A_j\, dx^j\right) = e^{iw_i} \]

Because the Chern-Simons theory is first order in time derivatives, these Wilson loops 
are really parameterising the phase space of solutions, rather than the configuration 
space. Moreover, because the Wilson loops are complex numbers of unit modulus, 
the phase space is compact. On general grounds, we expect that when we quantise a 
compact phase space, we get a finite-dimensional Hilbert space. (We met an example of 
this in Section 2.1.3 when first describing Wilson lines.) Our next task is to understand 
how to quantise the space of flat connections.


The canonical commutation relations can be read off from the Chern-Simons action
(8.21)


\[ [A_1(x), A_2(x')] = \frac{2\pi i}{k}\,\delta^2(x-x') \quad\Rightarrow\quad [w_1, w_2] = \frac{2\pi i}{k} \]


The algebraic relation obeyed by the Wilson loops then follows from the usual Baker-
Campbell-Hausdorff formula,


\[ e^{iw_1}e^{iw_2} = e^{-[w_1,w_2]/2}\,e^{i(w_1+w_2)} \]


which tells us that


\[ W_1W_2 = e^{-2\pi i/k}\,W_2W_1 \qquad (8.31) \]


But such an algebra of operators can’t be realised on a single vacuum state. This
immediately tells us that the ground state must be degenerate. The smallest representation
of (8.31) has dimension $k$, with the action


\[ W_1|n\rangle = e^{2\pi i n/k}|n\rangle \quad {\rm and} \quad W_2|n\rangle = |n-1\rangle \]


We have seen that on a torus $\Sigma = T^2$, an Abelian Chern-Simons theory has $k$ degenerate
ground states. The generalisation of this argument to a genus-$g$ Riemann surface tells
us that the ground state must have degeneracy $k^g$. Notice that we don’t have to say
anything about the shape or sizes of these manifolds. The number of ground states 
depends only on the topology. This is an example of topological order. 
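

The $k$-dimensional representation can be written down as explicit "clock" and "shift" matrices. The sketch below builds them in plain Python and reads off the phase $c$ in $W_1W_2 = c\,W_2W_1$. The function names and the `step` parameter are introduced here for illustration. The sign of the exponent is a matter of convention: it is $e^{+2\pi i/k}$ if the shift raises $n$ and $e^{-2\pi i/k}$ if it lowers $n$, and either way $c$ is a primitive $k$-th root of unity.

```python
import cmath
import math

def clock_and_shift(k, step=1):
    """k x k matrices with W1|n> = e^{2πi n/k}|n> and W2|n> = |n + step mod k>."""
    W1 = [[cmath.exp(2j * math.pi * n / k) if m == n else 0j for n in range(k)]
          for m in range(k)]
    W2 = [[1 + 0j if m == (n + step) % k else 0j for n in range(k)]
          for m in range(k)]
    return W1, W2

def matmul(A, B):
    size = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(size)) for j in range(size)]
            for i in range(size)]

def commutation_phase(k, step=1, tol=1e-12):
    """Return the phase c with W1 W2 = c W2 W1, checking it is the same for every entry."""
    W1, W2 = clock_and_shift(k, step)
    left, right = matmul(W1, W2), matmul(W2, W1)
    phases = [left[i][j] / right[i][j]
              for i in range(k) for j in range(k) if abs(right[i][j]) > tol]
    assert all(abs(p - phases[0]) < 1e-9 for p in phases)
    # Entries that vanish on one side must vanish on the other.
    assert all(abs(left[i][j]) < tol
               for i in range(k) for j in range(k) if abs(right[i][j]) <= tol)
    return phases[0]

for k in (1, 2, 3, 5):
    print(k, commutation_phase(k, step=1), commutation_phase(k, step=-1))
```

For $k = 1$ the phase is trivial and a single state suffices, which is the statement that the level-one theory has a unique ground state on the torus.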


8.4.3 Non-Abelian Chern-Simons Theories 


We’ve not had much to say about non-Abelian gauge theories in low dimensions. This 
is not because they’re boring, but simply because there is enough to keep us busy 
elsewhere. Here we make an exception and give a brief description of non-Abelian 
Chern-Simons theory. 


Like Yang-Mills, Chern-Simons is based on a Lie algebra valued gauge connection
$A_\mu$. The non-Abelian Chern-Simons action is


\[ S_{CS} = \frac{k}{4\pi}\int d^3x\; \epsilon^{\mu\nu\rho}\,{\rm tr}\left(A_\mu\partial_\nu A_\rho - \frac{2i}{3}A_\mu A_\nu A_\rho\right) \qquad (8.32) \]


We've met this term before: the theta term in d = 3 + 1 dimensions can be written as
a derivative of the Chern-Simons term (2.24). (It also arose in the same context when
discussing canonical quantisation of Yang-Mills (2.35).) Chern-Simons theories with
gauge group $G$ and level $k$ are sometimes denoted as $G_k$.


Once again, we will find that the level must be integer, $k \in {\bf Z}$. This time, however,
the computation is more direct than in the Abelian case. Under a gauge transformation,
we have


\[ A_\mu \rightarrow \Omega^{-1}A_\mu\Omega + i\Omega^{-1}\partial_\mu\Omega \]


with $\Omega \in G$. The field strength transforms as $F_{\mu\nu} \rightarrow \Omega^{-1}F_{\mu\nu}\Omega$. A simple calculation
shows that the Chern-Simons action changes as


\[ S_{CS} \rightarrow S_{CS} + \frac{k}{4\pi}\int d^3x \left\{ i\epsilon^{\mu\nu\rho}\partial_\nu\,{\rm tr}\left(\partial_\mu\Omega\,\Omega^{-1}A_\rho\right) + \frac{1}{3}\epsilon^{\mu\nu\rho}\,{\rm tr}\left(\Omega^{-1}\partial_\mu\Omega\,\Omega^{-1}\partial_\nu\Omega\,\Omega^{-1}\partial_\rho\Omega\right)\right\} \]


The first term is a total derivative. The same kind of term arose in Abelian Chern- 
Simons theories. However, the second term is novel to non-Abelian gauge theories, 
and this is where the quantisation requirement now comes from. In fact, we have seen 
this calculation before in Section 2.2.2 when discussing the theta angle in d = 3 + 1 
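Yang-Mills. On a spacetime manifold $S^3$ (or on ${\bf R}^3$ with the requirement that gauge
transformations asymptote to the same value at infinity), gauge transformations are
characterised by the homotopy group $\Pi_3(SU(N)) = {\bf Z}$. The winding is counted by the
function


\[ n(\Omega) = \frac{1}{24\pi^2}\int_{S^3} d^3x\; \epsilon^{\mu\nu\rho}\,{\rm tr}\left(\Omega^{-1}\partial_\mu\Omega\,\Omega^{-1}\partial_\nu\Omega\,\Omega^{-1}\partial_\rho\Omega\right) \in {\bf Z} \qquad (8.33) \]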
S 


We recognise this as the final term that appears in the variation of the Chern-Simons 
action. This means that the Chern-Simons action is not invariant under these large 
gauge transformations; it changes as 


$$S_{CS} \to S_{CS} + \frac{k}{12\pi}\,24\pi^2\,n(\Omega) = S_{CS} + 2\pi k\,n(\Omega)$$


Insisting that the path integral, with its weighting $e^{iS_{CS}}$, is gauge invariant then gives us 
immediately our quantisation condition $k \in \mathbf{Z}$. 
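

As a numerical sanity check of the winding number (8.33), the sketch below evaluates it for one explicit map from $R^3$ to SU(2). Everything specific here is an illustrative assumption rather than something taken from these lectures: the hedgehog form $\Omega = \exp(i f(r)\,\hat{x}\cdot\sigma)$, the profile $f(r) = \pi r/\sqrt{1+r^2}$ (chosen so that $\Omega \to -1$ in every direction at infinity), the finite-difference step and the radial cutoff. Because the winding density of a hedgehog is spherically symmetric, it is sampled along the z-axis only and then integrated over shells.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate; equals the inverse for an SU(2) matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def omega(x, y, z):
    """Hedgehog map Omega = cos f + i sin f (n.sigma), with f(0) = 0, f(inf) = pi."""
    r = math.sqrt(x * x + y * y + z * z)
    if r < 1e-14:
        return [[1 + 0j, 0j], [0j, 1 + 0j]]
    f = math.pi * r / math.sqrt(1.0 + r * r)  # assumed profile function
    c, s = math.cos(f), math.sin(f) / r
    return [[c + 1j * s * z, 1j * s * (x - 1j * y)],
            [1j * s * (x + 1j * y), c - 1j * s * z]]

# The six permutations of (0, 1, 2) with their signs, i.e. the epsilon symbol.
EPS_PERMS = [((0, 1, 2), 1), ((1, 2, 0), 1), ((2, 0, 1), 1),
             ((0, 2, 1), -1), ((2, 1, 0), -1), ((1, 0, 2), -1)]

def winding_density(p, h=1e-4):
    """(1/24 pi^2) eps^{ijk} tr(L_i L_j L_k), with L_i = Omega^{-1} d_i Omega."""
    inv = dagger(omega(*p))
    currents = []
    for i in range(3):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        o_up, o_dn = omega(*up), omega(*dn)
        d = [[(o_up[a][b] - o_dn[a][b]) / (2 * h) for b in range(2)]
             for a in range(2)]
        currents.append(mat_mul(inv, d))
    total = 0j
    for (i, j, k), sign in EPS_PERMS:
        m = mat_mul(mat_mul(currents[i], currents[j]), currents[k])
        total += sign * (m[0][0] + m[1][1])
    return total.real / (24 * math.pi ** 2)

def winding_number(r_max=30.0, steps=3000):
    """Trapezoid rule for the radial integral of 4 pi r^2 times the density."""
    dr = r_max / steps
    total = 0.0
    for n in range(steps + 1):
        r = n * dr
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * 4 * math.pi * r * r * winding_density((0.0, 0.0, r)) * dr
    return total

n = winding_number()
print("winding number:", round(n, 4))
```

For this map the integral comes out very close to 1, so the action shifts by $2\pi k$ and the weight $e^{iS_{CS}}$ is unchanged only when k is an integer.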


Wilson Loops 


We have so far avoided talking about Wilson lines in Chern-Simons theories. There is 
rather a lot to say. We will not describe this in detail here, but just sketch the key 
idea. 


In d = 3 Euclidean spacetime dimensions, a Wilson loop can get tangled. Mathe- 
maticians call closed curves in three dimensions knots, and there has been a great deal 
of effort in trying to classify the ways in which they can get tangled. It turns out that 
Chern-Simons theories provide one of the most powerful tools. For a given knot C, 
we can compute the Wilson loop $\langle W[C]\rangle$. In Chern-Simons theory the Wilson loop 
exhibits neither an area law, nor a perimeter law. Instead, it depends on the details 
of the topology of the knot C. For each gauge group G, the Wilson loop gives a topological 
invariant which is a polynomial (roughly in $q = e^{2\pi i/k}$). In simple cases, these 
topological invariants coincide with ones already understood by mathematicians (such 
as the Jones polynomial), but they also offer a large number of generalisations. Edward 
Witten was awarded the Fields medal, in large part for understanding this connection. 


8.5 Fermions and Chern-Simons Terms 


There is an intricate interplay between fermions in d = 2 + 1 dimensions and 
Chern-Simons terms. 


In signature $\eta^{\mu\nu} = {\rm diag}(+1,-1,-1)$, the Clifford algebra $\{\gamma^\mu,\gamma^\nu\} = 2\eta^{\mu\nu}$ is satisfied 
by the 2 x 2 gamma matrices, 


$$\gamma^0 = \sigma^2 \ ,\quad \gamma^1 = i\sigma^1 \ ,\quad \gamma^2 = i\sigma^3$$


The Dirac spinor is then a two-component complex object. In odd spacetime dimensions, 
there is no "$\gamma^5$" matrix and, correspondingly, no Weyl fermions. In d = 2+1, we 
can take the gamma matrices as above to be purely imaginary, which means that we can 
have Majorana fermions. However, we won't have a need for this real representation in 
what follows. 


It will prove useful to understand the action of parity on fermions. As we saw in 
(8.22), in three dimensions parity acts as 


$$x^0 \to x^0 \ ,\quad x^1 \to -x^1 \ ,\quad x^2 \to x^2$$


The Dirac action is then invariant if we take parity to act as 


$$P:\ \psi \mapsto \gamma^1\psi \qquad (8.34)$$


But this means that the fermion mass term necessarily breaks parity, 


$$P:\ \bar\psi\psi \mapsto -\bar\psi\psi$$


where, to see this, you need to remember that $(\gamma^1)^\dagger = -\gamma^1$ and $(\gamma^1)^2 = -1$. 
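

The algebraic statements above are easy to verify by hand, but a short script makes the bookkeeping explicit. The sketch below assumes the representation $\gamma^0 = \sigma^2$, $\gamma^1 = i\sigma^1$, $\gamma^2 = i\sigma^3$ and the parity action $\psi \mapsto \gamma^1\psi$; the helper names are hypothetical. It checks the Clifford algebra, that all three matrices are purely imaginary, and that $(\gamma^1)^\dagger\gamma^0\gamma^1 = -\gamma^0$, which is the identity behind $\bar\psi\psi \mapsto -\bar\psi\psi$ when $\bar\psi = \psi^\dagger\gamma^0$.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

IDENTITY = [[1, 0], [0, 1]]
SIGMA = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}

# Assumed representation: gamma^0 = sigma^2, gamma^1 = i sigma^1, gamma^2 = i sigma^3
GAMMA = [SIGMA[2], scale(1j, SIGMA[1]), scale(1j, SIGMA[3])]
ETA = [1, -1, -1]  # diagonal of the metric diag(+1, -1, -1)

def clifford_ok():
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} for all index pairs."""
    for mu in range(3):
        for nu in range(3):
            anti = mat_add(mat_mul(GAMMA[mu], GAMMA[nu]),
                           mat_mul(GAMMA[nu], GAMMA[mu]))
            target = scale(2 * ETA[mu] if mu == nu else 0, IDENTITY)
            if not close(anti, target):
                return False
    return True

def purely_imaginary():
    """True if every entry of every gamma matrix has vanishing real part."""
    return all(complex(g[i][j]).real == 0
               for g in GAMMA for i in range(2) for j in range(2))

def mass_term_is_parity_odd():
    """Check (gamma^1)^dagger gamma^0 gamma^1 = -gamma^0."""
    g0, g1 = GAMMA[0], GAMMA[1]
    return close(mat_mul(mat_mul(dagger(g1), g0), g1), scale(-1, g0))

print(clifford_ok(), purely_imaginary(), mass_term_is_parity_odd())
```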


This is different from what happens in d = 3+1 dimensions or, indeed, in any 
even spacetime dimension. There parity flips the sign of all spatial dimensions and, 
correspondingly, the Dirac action is invariant if we take $P: \psi \mapsto \gamma^0\psi$. This means that 
in even spacetime dimensions, $\bar\psi\psi$ is even under parity; in odd spacetime dimensions 
$\bar\psi\psi$ is odd. 


We can understand why this is by counting degrees of freedom. In d = 3+1 
dimensions, the Dirac spinor has 4 components. When we quantise a massive fermion, 
we get two particle states, spin up and spin down, and the same anti-particle states. 
But a Dirac fermion in d = 2+1 dimensions has only two components, and so we must 
have half the number of particle states of the d = 3+1 theory. The pair that we keep 
is dictated by the sign of the mass, and by CPT invariance: if we have a particle with 
spin, or angular momentum, +1/2, the theory must also include an anti-particle of spin 
−1/2. But this necessarily breaks parity: the theory has a particle of spin +1/2 but no 
particle of spin −1/2. 


8.5.1 Integrating out Massive Fermions 


Let us take a single Dirac fermion, of mass m, coupled to a U(1) gauge field $A_\mu$. The 
action is 


$$S = \int d^3x\; i\bar\psi\not\!\!D\psi + m\bar\psi\psi$$


If we care about physics at energies below the fermion mass m, we can integrate out 
the fermion. We work in Euclidean space. The fermion then gives a contribution to 
the low-energy effective action for the gauge field, 


$$S_{eff} = \log\det\left(i\not\!\!D + m\right) = {\rm Tr}\log\left(i\not\!\partial + \gamma^\mu A_\mu + m\right)$$


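We expand this as, 


$$S_{eff} = {\rm Tr}\log\left(i\not\!\partial + m\right) + {\rm Tr}\left(\frac{1}{i\not\!\partial + m}\gamma^\mu A_\mu\right) - \frac{1}{2}{\rm Tr}\left(\frac{1}{i\not\!\partial + m}\gamma^\mu A_\mu\frac{1}{i\not\!\partial + m}\gamma^\nu A_\nu\right) + \ldots$$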


The first term is an overall constant, and the second term cannot lead to anything gauge 
invariant. But the third term holds something interesting. If we give the background 
field $A_\mu$ momentum p, then the trace over momenta corresponds to the diagram, 


$$-\frac{1}{2}A_\mu(-p)A_\nu(p)\int\frac{d^3k}{(2\pi)^3}\,{\rm tr}\left(\gamma^\mu\frac{1}{\not\!p + \not\!k + m}\gamma^\nu\frac{1}{\not\!k + m}\right)$$


$$= -\frac{1}{2}A_\mu(-p)A_\nu(p)\int\frac{d^3k}{(2\pi)^3}\,{\rm tr}\left(\gamma^\mu\frac{\not\!p + \not\!k - m}{(p+k)^2 + m^2}\gamma^\nu\frac{\not\!k - m}{k^2 + m^2}\right)$$


where we've used the fact that, after the Wick rotation, each gamma matrix squares 
to −1. 


The trace picks out the non-vanishing gamma matrix structure. There will be a 
contribution to the Maxwell term; that doesn't interest us here. Instead, we care 
about the term we get when three gamma matrices are multiplied together. The trace 
structure gives 


$${\rm tr}\,\gamma^\mu\gamma^\nu\gamma^\rho = -2\epsilon^{\mu\nu\rho}$$


The resulting term is 


$$\epsilon^{\mu\nu\rho}p_\rho\,A_\mu(-p)A_\nu(p)\int\frac{d^3k}{(2\pi)^3}\frac{m}{((p+k)^2 + m^2)(k^2 + m^2)}$$


We're interested in this integral in the infra-red limit, $p \to 0$, where it is given by 


$$\int\frac{d^3k}{(2\pi)^3}\frac{m}{(k^2+m^2)^2} = \frac{1}{2\pi^2}\int_0^\infty dk\,\frac{m k^2}{(k^2+m^2)^2} = \frac{1}{8\pi}\frac{m}{|m|}$$
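

The striking feature of this integral is that it depends only on the sign of m, not on its size. A quick numerical check makes this concrete. The sketch below is only an illustration: the function name, the change of variables $k = t/(1-t)$ and the midpoint rule are choices made here, not part of the calculation in the text. It evaluates $\frac{1}{2\pi^2}\int_0^\infty dk\, m k^2/(k^2+m^2)^2$ for a few masses and compares with ${\rm sign}(m)/8\pi$.

```python
import math

def loop_integral(m, n=20000):
    """(1/2 pi^2) * int_0^inf dk  m k^2 / (k^2 + m^2)^2.

    The half-line is mapped to (0, 1) with k = t / (1 - t) and the
    result is integrated with the midpoint rule.
    """
    total = 0.0
    for j in range(n):
        t = (j + 0.5) / n
        k = t / (1.0 - t)
        jacobian = 1.0 / (1.0 - t) ** 2  # dk/dt
        total += m * k * k / (k * k + m * m) ** 2 * jacobian
    return total / n / (2 * math.pi ** 2)

for mass in (0.5, -2.0, 10.0):
    expected = math.copysign(1.0, mass) / (8 * math.pi)
    print(mass, loop_integral(mass), expected)
```

All three masses give the same magnitude, with the overall sign following that of m.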


Putting this together, the 1-loop diagram gives 


$$\frac{1}{8\pi}\frac{m}{|m|}\,\epsilon^{\mu\nu\rho}A_\mu(-p)A_\nu(p)\,p_\rho$$


Back in real space, this gives us the leading term to the low energy effective action 


$$S_{eff} = \frac{i\,{\rm sign}(m)}{8\pi}\int d^3x\;\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho \qquad (8.35)$$


There are a number of interesting things to point out about this result. First, the 
effective action comes with a factor of i; this is expected for the Chern-Simons term in 
Euclidean space, and follows from Wick rotating terms with an $\epsilon$ symbol. 


Second, and more surprisingly, the fermion does not decouple in the limit $m \to \infty$. 
After integrating out a massive field, one typically generates terms in the effective action 
that scale as a power of 1/m. Not so for the Chern-Simons term: it is proportional 
to the sign of the mass. This behaviour holds for fermions in any odd spacetime 
dimensions; we met a similar example in d = 4+ 1 when discussing anomaly inflow in 
Section 4.4.2. 


Finally, and most importantly, the effective action (8.35) is not gauge invariant! It 
is a Chern-Simons term (8.21) with level k = ±1/2. Yet, we saw in the previous section, 
that the Chern-Simons term is only gauge invariant for $k \in \mathbf{Z}$. With k = ±1/2, the sign 
of the partition function can flip under gauge transformations. 


What are we to make of this? It appears that a single massive Dirac fermion, coupled 
to a U(1) gauge field, is inconsistent. This is very much reminiscent of the gauge 
anomalies that we met in d = 3+ 1 dimensions in Section 3. However, we shouldn’t be 
too hasty. After all, anomalies in d = 3+ 1 dimensions were strictly related to massless 
fermions, and here we’re dealing with a massive fermion. What’s going on? 


Indeed, we were sloppy in how we dealt with UV divergences in the calculation above. 
They do not arise in the calculation of the Chern-Simons term, but they will surely be 
important if we compute other quantities and, as in any quantum field theory, we need 
a way to regulate them. To achieve this, we introduce a Pauli-Villars regulator field, 
together with suitable counterterms. We take the Pauli-Villars field to have real mass 
$\Lambda_{UV} > 0$. The regulated Dirac determinant is then 


$$\frac{\det\left(i\not\!\!D + m\right)}{\det\left(i\not\!\!D + \Lambda_{UV}\right)}$$


This gives two contributions to the Chern-Simons term; one from our fermion, and one 
from the regulator. The effective action for the gauge field then becomes 


$$\log\left(\frac{\det\left(i\not\!\!D + m\right)}{\det\left(i\not\!\!D + \Lambda_{UV}\right)}\right) = \frac{i}{4\pi}\cdot\frac{1}{2}\left({\rm sign}(m) - 1\right)\int d^3x\;\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho$$


which vanishes when m > 0 but gives a Chern-Simons term of level k = −1 when m < 0. 
In other words, when the regulated fermion determinant is defined more carefully, there 
is no problem with gauge invariance. 


The resulting situation is notationally inconvenient. Usually we would like to write 
down an action as shorthand for a quantum field theory, even though we know that to 
fully define the theory really requires a statement about how we regulate. The issue 
above means that the sign of the mass of the Pauli-Villars regulator matters in a crucial 
fashion. To avoid this, we are often sloppy and pretend that we've already integrated 
out the Pauli-Villars field to generate a bare Chern-Simons term with level k = −1/2 in 
the action. 


More generally, we can couple $N_f$ Dirac fermions to a U(1) gauge field with the 
leading terms in the action given by 


$$S = \int d^3x\; -\frac{1}{4e^2}F_{\mu\nu}F^{\mu\nu} + \frac{k}{4\pi}\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho + \sum_{i=1}^{N_f} i\bar\psi_i\not\!\!D\psi_i + m_i\bar\psi_i\psi_i$$


Using the convention that the Chern-Simons term already includes the contributions 
from Pauli-Villars fields, gauge invariance requires 


$$k + \frac{N_f}{2} \in \mathbf{Z}$$


This interplay between the level k and the number of fermions is sometimes referred to 
as the parity anomaly. It's not a great name since the theory with fermion masses is 
not parity invariant to begin with. 
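

The bookkeeping of half-integer levels can be summarised in a few lines of code. The sketch below is a hedged illustration, with hypothetical function names. It assumes the convention described above, in which the bare level k already includes the −1/2 from each Pauli-Villars regulator, and it assumes that the shifts from several fermions simply add, each massive fermion contributing sign(m)/2 as in (8.35).

```python
from fractions import Fraction

def sign(x):
    """Return +1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)

def is_gauge_invariant(k, n_f):
    """Consistency condition k + N_f/2 in Z for the bare level k."""
    return (Fraction(k) + Fraction(n_f, 2)).denominator == 1

def infrared_level(k, masses):
    """Level left after integrating out the massive fermions.

    Assumption: each fermion shifts the level by sign(m)/2 and the
    shifts add, so the result is k + (1/2) * sum_i sign(m_i).
    """
    return Fraction(k) + sum(Fraction(sign(m), 2) for m in masses)

half = Fraction(1, 2)
print(is_gauge_invariant(-half, 1))   # one fermion, bare level -1/2
print(infrared_level(-half, [+1.0]))  # m > 0
print(infrared_level(-half, [-1.0]))  # m < 0
```

For a single fermion with bare level −1/2 this reproduces the two cases found above: the infra-red level is 0 for m > 0 and −1 for m < 0, both integers.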


8.5.2 Massless Fermions and the Parity Anomaly 


We can gain a slightly different perspective on the ideas above by considering a massless 
fermion coupled to a U(1) gauge field, $A_\mu$. The action is now 


$$S = \int d^3x\; i\bar\psi\not\!\!D\psi$$


The transformation (8.34) ensures that the classical action is invariant under parity, 
provided that we also act with $A_1 \to -A_1$. 


The classical action is invariant under parity. But what about the partition function? 
To answer this, we must make sense of the determinant of the Dirac operator, 


$$Z[A] = \det\left(i\not\!\!D\right)$$


As above, we work in Euclidean space. The Dirac operator is Hermitian, which means 
that it has real eigenvalues, 


$$i\not\!\!D\,\phi_n = \lambda_n\phi_n \ ,\quad \lambda_n \in \mathbf{R}$$


So formally we can write 


$$Z = \prod_n \lambda_n$$


Of course, this formula is divergent and so we must work to make sense of it. For 
now, we would like to ask the following question: what is the sign of $\det(i\not\!\!D)$? Roughly 
speaking, this must be the difference between the number of negative eigenvalues and 
the number of positive eigenvalues. But, as there are an infinite number of each, it is 
not clear how to count them. 


Why do we care so much about the sign? The problem comes if we try to reconcile a 
given sign with the requirements of gauge invariance. Suppose that we start with some 
gauge configuration $A_\mu$ and decide that $\det(i\not\!\!D)$ has a specific sign. Then it better be 
the case that, for any gauge configuration $A'_\mu$, related to $A_\mu$ by a gauge transformation, 
the sign of $\det(i\not\!\!D)$ remains the same. 


At this point, the discussion may be ringing bells. It is entirely analogous to the 
SU(2) anomaly that we described in Section 3.4.3. We proceed in a very similar way. 
Consider the 1-parameter family of gauge configurations, 


$$A_\mu(s;x) = (1-s)\,A_\mu(x) + s\,A'_\mu(x) \qquad (8.36)$$


This has the property that it interpolates from $A_\mu$ when s = 0 to $A'_\mu$ when s = 1. 
The question that we would like to answer is: how many eigenvalues pass through 
zero and change sign as we vary $s \in [0,1]$? To answer this, we can consider the gauge 
configuration $A_\mu(s;x)$ in (8.36) to live on the four manifold $I \times R^3$, where I is the 
interval parameterised by $0 \le s \le 1$. 


The number of times that an eigenvalue crosses zero is given by the index of 
the Dirac operator. This is the object that we introduced in Section 3.3.1 where, on a 
closed four manifold, the Atiyah-Singer index theorem allowed us to write 


$${\rm Index}(i\not\!\!D_{4d}) = \frac{1}{32\pi^2}\int d^4x\;\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}$$


In 4d, the index counts the difference between the number of left-handed and right-handed 
zero modes. For our purposes, it tells us the difference between the number of 
eigenvalues that switch from positive to negative, and those which switch from negative 
to positive. In other words, under the gauge transformation $A_\mu \to A'_\mu$, the partition 
function of the massless fermion changes as 


$$Z \to Z\,(-1)^{{\rm Index}(i\not\!\!D_{4d})}$$


There is no reason for this index to be even. We see, once again, that without regularisation 
the sign of the partition function can change under a suitable gauge transformation. 


What happens if we now include a regulator? In mathematics, a suitably regulated 
sum of the signs of the eigenvalues of $i\not\!\!D$ is known as the Atiyah-Patodi-Singer 
eta-invariant. It is defined by 


$$\eta(A) = \lim_{\epsilon\to 0^+}\sum_n e^{-\epsilon\lambda_n^2}\,{\rm sign}(\lambda_n)$$


We then define a regulated version of the fermion partition function as 


$$Z[A] = |\det(i\not\!\!D)|\;e^{-i\pi\eta(A)/2}$$


The $\eta$ invariant depends on the background gauge field A. The Atiyah-Patodi-Singer 
index theorem provides an expression for $\eta$ in terms of the gauge field. If we restrict to 
the generic situation where the gauge field has no zero modes, then one can show that 


$$\pi\eta(A) = \frac{1}{4\pi}\int d^3x\;\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho$$


This reproduces the expression that we found previously from the Pauli-Villars 
regularisation. In general, the eta-invariant is the more mathematically rigorous way to 
describe what's happening as it allows one to track what happens as eigenvalues pass 
through zero. 
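

To get a feel for how a regulated sum of signs behaves, it helps to look at a toy spectrum. The sketch below is an illustration under stated assumptions and is not the Dirac operator of the text: it takes the hypothetical spectrum $\lambda_n = n + s$ with n running over the integers, and a Gaussian regulator $e^{-\epsilon\lambda^2}$ at small but finite $\epsilon$. For 0 < s < 1 the regulated sum approaches 1 − 2s, and it jumps by 2 each time an eigenvalue passes through zero.

```python
import math

def eta(shift, eps=1e-4, n_max=2000):
    """Regulated sum of signs for the toy spectrum lambda_n = n + shift.

    The Gaussian factor exp(-eps * lambda^2) makes the sum converge;
    eps is small but finite, so the result carries O(eps) corrections.
    """
    total = 0.0
    for n in range(-n_max, n_max + 1):
        lam = n + shift
        if lam != 0.0:
            total += math.copysign(1.0, lam) * math.exp(-eps * lam * lam)
    return total

print(eta(0.25))                # close to 1 - 2 * 0.25 = 0.5
print(eta(1e-3) - eta(-1e-3))   # close to 2: one eigenvalue changed sign
```

When one eigenvalue changes sign, the product of eigenvalues flips sign, while a jump of 2 in $\eta$ changes a phase of the form $e^{\mp i\pi\eta/2}$ by a factor of −1. The two effects compensate, which is why the regulated partition function varies smoothly.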


8.6 3d Bosonization 


In two spacetime dimensions, there is not much of a distinction between bosons and 
fermions. The map between them is known as bosonization and was described in 
Section 7.5. 


In three spacetime dimensions, bosons are not the same as fermions. We can tell 
which one we have in the same way as we would in four dimensions. Given a pair of 
particles we can rotate them by 180°, keeping them well separated. The wavefunction 
for a pair of bosons will come back to itself, while the wavefunction for a pair of fermions 
comes back with a minus sign. 


Nonetheless, it is possible to use Chern-Simons terms to change statistics of an 
excitation from a boson to a fermion. This process is referred to as 3d bosonization. 


8.6.1 Flux Attachment 


To get a feel for what’s going on, it’s useful to first revert to some non-relativistic 
physics. Consider Chern-Simons theory coupled to a current $J^\mu$ 


$$S = \int d^3x\; \frac{k}{4\pi}\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho + A_\mu J^\mu \qquad (8.37)$$


We can insert a test particle of unit charge by taking $J^0 = \delta^2(x)$. How does the 
gauge field respond? Gauss' law tells us that the charged particle is accompanied by a 
fractional magnetic flux, 


$$\frac{1}{2\pi}F_{12} = \frac{1}{k}\,\delta^2(x) \qquad (8.38)$$


This is referred to as flux attachment. 


Now consider two such particles. We will exchange them to determine their quantum 
statistics. The wavefunction will pick up a factor of ±1 depending on whether the 
original particles were bosons or fermions. However, there is a second contribution to the phase 
of the wavefunction that comes from the Aharonov-Bohm effect. 


Recall that a particle of charge q moving around a flux $\Phi$ picks up a phase $e^{iq\Phi}$. 
But because of flux attachment (8.38), the particles carry both charge q = 1 and flux 
$\Phi = 2\pi/k$. If we move one particle all the way around another, we will get a phase $e^{iq\Phi}$. 
But the statistical phase is defined by exchanging particles, which consists of only half 
an orbit (followed by a translation which contributes no phase). So, after exchange, 
the expected statistical phase is 


$$\pm e^{iq\Phi/2} = \pm e^{i\pi/k}$$


where we take the + sign if our original particles are bosons and the — sign if they 
were fermions. We see that the effect of the Chern-Simons term is to transmute the 
quantum statistics of the particles. In particular, if we take a Chern-Simons term at 
level k = +1, what were bosons become fermions and vice versa. Once again, we see 
that the topological nature of the Chern-Simons term endows it with seemingly magic 
infra-red properties: it can change the behaviour of far separated particles, even though 
it has no propagating degrees of freedom. 


For |k| > 1, the particles are neither bosons nor fermions. Instead they carry frac- 
tional quantum statistics. Such particles are called anyons and are allowed only in 
d = 2+ 1 dimensions. You can read more about them in the lecture notes on the 
Quantum Hall Effect where they play a prominent role. 


A Famously Fiddly Factor of 2 


The calculation above contains an annoying factor of 2 that we've swept under the 
rug. Here's the issue. As the charge q in the first particle moved around the flux $\Phi$ 
in the second, we picked up a phase $e^{iq\Phi}$. But you might think that the flux $\Phi$ of the 
first particle also moved around the charge q of the second. So surely this should give 
another factor of $e^{iq\Phi}$. Right? Well, no. To see why, it's best to just do the calculation. 


For generality, let's take N particles sitting at positions $\mathbf{x}_a(t)$ which, as the notation 
shows, we allow to change with time. The charge density and currents are 


$$J^0(\mathbf{x},t) = \sum_{a=1}^N \delta^2(\mathbf{x} - \mathbf{x}_a(t)) \quad {\rm and} \quad \mathbf{J}(\mathbf{x},t) = \sum_{a=1}^N \dot{\mathbf{x}}_a\,\delta^2(\mathbf{x} - \mathbf{x}_a(t))$$


The equation of motion from (8.37) is 


$$\frac{1}{2\pi}F_{\mu\nu} = \frac{1}{k}\,\epsilon_{\mu\nu\rho}J^\rho$$


and can be easily solved even in this general case. We work in Coulomb gauge with 
$A_0 = 0$ and $\nabla\cdot\mathbf{A} = 0$. The solution is then 


$$A_i(\mathbf{x},t) = \frac{1}{k}\sum_{a=1}^N \epsilon^{ij}\,\frac{x^j - x^j_a(t)}{|\mathbf{x} - \mathbf{x}_a(t)|^2} \qquad (8.39)$$


This follows from the standard methods that we know from our Electromagnetism 
lectures, but this time using the Green's function for the Laplacian in two dimensions: 
$\nabla^2\log|\mathbf{x} - \mathbf{y}| = 2\pi\delta^2(\mathbf{x} - \mathbf{y})$. This solution is again the statement that each particle 
carries flux 1/k, in units of $2\pi$. However, we can also use this solution directly to compute the phase 
change when one particle, say the first one, is transported along a curve C. It is 


$$\exp\left(i\oint_C \mathbf{A}\cdot d\mathbf{x}\right)$$


If the curve C encloses one other particle, the resulting phase change can be computed 
simply to be $e^{2\pi i/k}$. As before, if we exchange two particles, we get half this phase, or $e^{i\pi/k}$. 
This, of course, is the same result we got above. 
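

The claim that the phase depends only on whether the curve encloses the other particle can be checked numerically. The sketch below is a hedged illustration with hypothetical function names. It assumes a gauge field of the form $A_i = \frac{1}{k}\sum_a \epsilon^{ij}(x^j - x_a^j)/|\mathbf{x} - \mathbf{x}_a|^2$ with the convention $\epsilon^{12} = +1$, and integrates $\mathbf{A}\cdot d\mathbf{x}$ counter-clockwise around a circle. The overall sign of the result depends on the orientation of the curve and on the sign convention for $\epsilon^{ij}$, so only the magnitude is compared with $2\pi/k$.

```python
import cmath
import math

def gauge_field(x, y, sources, k):
    """A_i = (1/k) sum_a eps^{ij} (x^j - x_a^j) / |x - x_a|^2, with eps^{12} = +1."""
    ax = ay = 0.0
    for sx, sy in sources:
        dx, dy = x - sx, y - sy
        r2 = dx * dx + dy * dy
        ax += dy / (k * r2)
        ay += -dx / (k * r2)
    return ax, ay

def holonomy(center, radius, sources, k, steps=2000):
    """Line integral of A.dx counter-clockwise around a circle (midpoint rule)."""
    total = 0.0
    dt = 2 * math.pi / steps
    for n in range(steps):
        t = (n + 0.5) * dt
        x = center[0] + radius * math.cos(t)
        y = center[1] + radius * math.sin(t)
        ax, ay = gauge_field(x, y, sources, k)
        total += (-ax * math.sin(t) + ay * math.cos(t)) * radius * dt
    return total

sources = [(0.0, 0.0)]                                # one particle at the origin
enclosing = holonomy((0.3, -0.2), 1.0, sources, k=3)  # curve encloses it
missing = holonomy((3.0, 0.0), 1.0, sources, k=3)     # curve does not
print(abs(enclosing), 2 * math.pi / 3, missing)

# Exchange is half an orbit. For k = 1 the extra phase is -1 for either sign.
exchange_phase = cmath.exp(0.5j * holonomy((0.3, -0.2), 1.0, sources, k=1))
print(exchange_phase)
```

The off-centre circle gives the same answer as a centred one, and the curve that misses the particle gives zero. For k = 1 the exchange phase is −1, which is the statement that bosons are turned into fermions.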


8.6.2 A Bosonization Duality 


The discussion above shows that Chern-Simons terms can turn bosons into fermions and 
vice-versa. However, it holds only for massive particles, and cannot be easily generalised 
to massless particles, let alone to relativistic quantum field theories. Nonetheless, it 
is suggestive that it may be possible to write down a quantum field theory of bosons 
coupled to Chern-Simons terms that has a dual interpretation in terms of fermions. As 
we now explain, it is thought that this is indeed the case. 


Before we proceed, we’re going to make a small change in notation. In what follows, 
there will be lots of U(1) gauge fields floating around. Some of them will be dynamical, 
while others will be background gauge fields that we couple to currents. To distinguish 
between these, we use the following convention: dynamical gauge fields will be written 
in lower case, e.g. $a_\mu$. Meanwhile, background gauge fields will be written in upper 
case, e.g. $A_\mu$. 


This convention differs from what we've used throughout these lectures, where we 
typically refer to all gauge fields, dynamical or background, as $A_\mu$. It is, however, 
a standard convention in condensed matter physics where the true electromagnetic 
gauge field $A_\mu$ is typically a background field, describing electric or magnetic fields 
that the experimenter has chosen to turn on. In contrast, 3d dynamical gauge fields 
$a_\mu$ are always emergent excitations, arising from some collective behaviour of strongly 
coupled electrons. 


Consider the following theory, which we refer to as Theory A: a complex scalar field 
coupled to a U(1) gauge field, with Chern-Simons term at level k = 1, 


$$S_A = \int d^3x\; -\frac{1}{4e^2}f_{\mu\nu}f^{\mu\nu} + \frac{1}{4\pi}\epsilon^{\mu\nu\rho}a_\mu\partial_\nu a_\rho + |D_\mu\phi|^2 - m^2|\phi|^2 - \frac{\lambda}{2}|\phi|^4 \qquad (8.40)$$


This is the Abelian Higgs model (8.12), but with the addition of a Chern-Simons term. 
Just as before, it is straightforward to analyse in the limits $m^2 \gg e^4$ and $m^2 \ll -e^4$ 
where it is a theory of weakly interacting massive particles. But we'd like to understand 
what happens in the strongly coupled regime. We will argue below that as we vary 
$m^2$ from positive to negative, there is a unique second order phase transition, roughly 
at m = 0. You can think of this gapless theory as the XY critical point, coupled 
to a Chern-Simons gauge field $U(1)_1$. Below, we will conjecture an alternative, and 
somewhat simpler, description. 


In the infra-red limit $e^2 \to \infty$, the Gauss' law constraint gives rise to the local flux 
attachment condition, 


$$\frac{f_{12}}{2\pi} + \rho_{scalar} = 0 \qquad (8.41)$$


where $\rho_{scalar}$ is the charge density of the scalar field $\phi$. In the non-relativistic setting, 
which can be invoked when $m^2 \gg e^4$, we viewed this as attaching flux to every scalar 
excitation and saw that, for k = 1, this turns a boson into a fermion. In the relativistic 
setting, it turns out to be more appropriate to think of attaching a scalar to every flux. 


To see this, first note that the theory has a conserved global symmetry, with the 
topological current (8.3) 


$$J^\mu_{top} = \frac{1}{2\pi}\epsilon^{\mu\nu\rho}\partial_\nu a_\rho \qquad (8.42)$$


We know from our earlier discussion in Section 8.1 that the local operators which carry 
charge under this current are monopole operators $\mathcal{M}(x)$, which insert magnetic flux 
at a point. The flux attachment (8.41) is telling us that, in the presence of a Chern-Simons 
term, these monopole operators are not gauge invariant. We can make them 
gauge invariant only by dressing them with some scalar charge $\rho_{scalar}$. Schematically, 
we refer to the gauge invariant composite operator as $\mathcal{M}\phi$. 


How do we do this less schematically? The right way to proceed is to solve the 
equation of motion for the scalar in the presence of a Dirac monopole. We then treat 
each mode quantum mechanically: the flux attachment condition (8.41) tells us that 
we should excite a single mode. The monopole operator with the lowest dimension will 
correspond to exciting the lowest energy scalar mode. 


We won’t go through this full calculation. However, the key physics can be seen 
from a simple calculation that we did back in Section 1.1: a charged particle moving 
in a minimal Dirac monopole receives a shift of $\hbar/2$ to its angular momentum. (See, 
in particular, equation (1.9).) This means that exciting any bosonic mode will shift 
the angular momentum of the monopole to become 1/2-integer. But, in a relativistic 
theory, the spin-statistics relation must hold. If our gauge invariant monopole operator 
$\mathcal{M}\phi$ has spin 1/2, then it must also be a fermion. 


We see that this argument leads to the same result as before: a bosonic theory coupled 
to a U(1) Chern-Simons gauge field at level k = 1 is really a theory of fermions. The 
obvious question is: what theory of fermions? 


It is conjectured that, close to the critical point, the bosonic theory (8.40) is really 
just a free Dirac fermion! In other words, it can be equivalently described as 


$$S_B[\psi] = \int d^3x\; i\bar\psi\not\!\partial\psi - m'\bar\psi\psi \qquad (8.43)$$


The map is very similar to that of particle-vortex duality that we saw in Section 8.2.1. 
In particular, the fermion is described by the dressed monopole operator in Theory A, 


$$\mathcal{M}\phi \quad\longleftrightarrow\quad \psi$$


while the U(1) currents map between themselves 


$$J^\mu_{top} = \frac{1}{2\pi}\epsilon^{\mu\nu\rho}\partial_\nu a_\rho \quad\longleftrightarrow\quad j^\mu = \bar\psi\gamma^\mu\psi \qquad (8.44)$$


Checking the Topological Phases 


Let’s now look for some evidence that this claimed duality is correct. In the case 
of particle-vortex duality, we checked that the theories looked similar in the weakly 
coupled regimes $|m^2| \gg e^4$. We can try to do something similar here. 


This is simplest for Theory B. To study the relevant physics, we couple the current 
(8.44) to a background gauge field $A_\mu$. The partition function for each theory then 
depends on this background field. For Theory B it is 


$$Z_B[A] = \int \mathcal{D}\psi\;\exp\left(iS_B[\psi] + i\int d^3x\; A_\mu j^\mu - \frac{1}{2}\frac{1}{4\pi}\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho\right)$$


Note that we are using the convention described in Section 8.5, in which the half-integer 
Chern-Simons term arising from the Pauli-Villars regulator field is shown explicitly in 
the action. We have chosen to add this term with level k = −1/2. 


When the fermions are massive, $m' \neq 0$, we can integrate them out and generate an 
effective theory for the background fields $A_\mu$. The lowest dimension term is a 
Chern-Simons interaction for $A_\mu$, 


$$Z[A] = \exp\left(\frac{ik}{4\pi}\int d^3x\;\epsilon^{\mu\nu\rho}A_\mu\partial_\nu A_\rho + \ldots\right) \qquad (8.45)$$


From our discussion in Section 8.5, we know that after integrating out the massive 
fermion $\psi$ the Chern-Simons level for the background gauge field will be 


$$k = \frac{1}{2}\left(-1 + {\rm sign}(m')\right) = \begin{cases} 0 & m' > 0 \\ -1 & m' < 0 \end{cases}$$


It may seem odd to write down an action for background fields which don't fluctuate, 
but there's important information in the coefficient k: it is the Hall conductivity of the 
topological gapped phase. This follows by using the partition function Z[A] to compute 
the response of the current $j^\mu$ to a background electric field 


$$\langle j^\mu(x)\rangle = -i\,\frac{\delta\log Z[A]}{\delta A_\mu(x)} \quad\Rightarrow\quad \langle j_i\rangle = -\frac{k}{2\pi}\,\epsilon_{ij}E_j$$


You can read (a lot) more about the Hall conductivity in the lectures on the Quantum 
Hall Effect. 


We would like to see how this effect is encoded in the bosonic Theory A. We couple 
the background gauge field $A_\mu$ to the topological current (8.42) to get the partition 
function 


$$Z_A[A] = \int \mathcal{D}\phi\,\mathcal{D}a\;\exp\left(iS_A[\phi,a] + i\int d^3x\; J^\mu_{top}A_\mu\right)$$


where we're neglecting gauge fixing terms. This time we only have a scalar field, which 
does not shift the level of the Chern-Simons term when integrated out. Nonetheless, 
we can still reproduce the result (8.46) for the Hall conductivity. To see how this 
works, let's start with the mass $m^2 \gg e^4$ where, at low energies, the scalar field simply 
decouples, leaving us with the effective action 


$$S_{eff}[a,A] = \int d^3x\; \frac{1}{4\pi}\epsilon^{\mu\nu\rho}a_\mu\partial_\nu a_\rho + \frac{1}{2\pi}\epsilon^{\mu\nu\rho}A_\mu\partial_\nu a_\rho$$


The equation of motion for the dynamical gauge field a is simply a = −A. Substituting 
this back in gives the effective action (8.45) with k = −1. 


What happens when $m^2 \ll -e^4$? In this case the scalar field condenses and the 
dynamical gauge field a becomes gapped. This extra term kills the Hall conductivity, 
leaving us with (8.45) with k = 0. We see that the scalar field does reproduce the 
topological phases of the fermion theory as promised. This requires the map, 


$$m^2 \quad\longleftrightarrow\quad -m' \qquad\Rightarrow\qquad \phi^\dagger\phi \quad\longleftrightarrow\quad -\bar\psi\psi$$


The agreement between the topological phases is promising, but a long way from demon- 
strating the claimed duality between Theory A (8.40) and the free fermion (8.43). There 
are a number of other routes which lead us to the duality (including large N methods, 
holography, lattice constructions and supersymmetry) but we will not discuss them 
here. Instead we will assume that bosonization duality holds and ask: what can we do 
with it? 
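

The matching of topological phases amounts to a small piece of arithmetic, which the sketch below spells out. It is an illustration with hypothetical function names, under two assumptions. For Theory B the level is taken to be $\frac{1}{2}(-1 + {\rm sign}(m'))$. For Theory A in the phase $m^2 > 0$ the effective Lagrangian is taken to be $\frac{1}{4\pi}(a\,da + 2A\,da)$ and a is eliminated by its equation of motion, while in the Higgs phase $m^2 < 0$ the level is set to zero.

```python
from fractions import Fraction

def level_theory_b(m_prime):
    """Background level after integrating out the fermion: (-1 + sign(m'))/2."""
    s = (m_prime > 0) - (m_prime < 0)
    return Fraction(-1 + s, 2)

def level_theory_a(m_squared):
    """Background level of the scalar theory in its two gapped phases.

    For m^2 > 0 the Lagrangian (1/4 pi)(k_aa a da + 2 k_aA A da) is
    extremised at a = -(k_aA / k_aa) A, which leaves the level
    -k_aA^2 / k_aa for A. For m^2 < 0 the gauge field a is Higgsed
    and the Hall response vanishes.
    """
    if m_squared > 0:
        k_aa, k_aA = Fraction(1), Fraction(1)
        return -k_aA * k_aA / k_aa
    return Fraction(0)

for m_squared in (+1.0, -1.0):
    m_prime = -m_squared  # the proposed map between the two mass deformations
    print(m_squared, level_theory_a(m_squared), level_theory_b(m_prime))
```

With the identification $m^2 \leftrightarrow -m'$, both phases agree: the levels are −1 and 0 on each side.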


8.6.3 The Beginning of a Duality Web 


We will now show how, starting from the bosonization duality, we can derive further 
equivalences between quantum field theories. First, some conventions. We will revert 
to form notation for the gauge fields, and write the Chern-Simons terms as 


$$\frac{1}{4\pi}\epsilon^{\mu\nu\rho}a_\mu\partial_\nu a_\rho = \frac{1}{4\pi}\,ada$$


$$\frac{1}{2\pi}\epsilon^{\mu\nu\rho}A_\mu\partial_\nu a_\rho = \frac{1}{2\pi}\,Ada = \frac{1}{2\pi}\,adA$$


Both of these are correctly normalised as explained in Section 8.4: they can be added 
to the action only with integer-valued coefficients. We will denote the gauge field under 
which matter is charged by adding a subscript to the covariant derivative like this, 


$$D_a\phi = \partial\phi - ia\phi$$


The spacetime index on the derivatives will be suppressed. In what follows, the 
distinction between dynamical gauge fields and background gauge fields will be crucial. 
As we mentioned previously, they are distinguished by case. Lower case gauge fields, 
a, b, c, ... will always be dynamical; upper case gauge fields A, B, C, ... will always be 
background. 


In this notation, we write the 3d bosonization duality that we described above as an 
equivalence between two theories 


$$|D_b\phi|^2 - |\phi|^4 + \frac{1}{4\pi}\,bdb + \frac{1}{2\pi}\,Adb \quad\longleftrightarrow\quad i\bar\psi\not\!\!D_A\psi - \frac{1}{2}\frac{1}{4\pi}\,AdA \qquad (8.46)$$


Much of this expression is shorthand. First, we have set the mass terms to zero on 
both sides. This really means that we tune to the critical point. On the fermionic side 
this is obvious, but the scalar side includes a $|\phi|^4$ term which is taken to mean that we 
flow to the Wilson-Fisher fixed point of the theory, rather than the free fixed point. Of 
course, we don't literally get to the Wilson-Fisher fixed point by simply setting $m^2 = 0$; instead 
we must tune $m^2$, or more generally the coefficient of the relevant operator, as we flow 
to the IR to hit the critical point. All of this is buried in the notation above. 


Second, we reiterate that the scalar $\phi$ in the above expression is charged under a 
dynamical gauge field, which we have called b to prepare us for some manipulations 
ahead. This means that we integrate over (gauge equivalent) configurations of b in the 
path integral. In contrast, the fermion $\psi$ is charged under the background field A. We 
can read off the duality map (8.44) between currents by seeing which terms on both 
sides are coupled to A. Finally, we've omitted nearly all the details of the regularisation 
of the field theory, with one exception: the level −1/2 Chern-Simons term on the 
right-hand-side can be thought of as coming from integrating out a Pauli-Villars regulator. 
This was explained in Section 8.5. (A warning: some places in the literature adopt a 
different convention where this level −1/2 Chern-Simons term remains hidden in the 
regulator.) 


At this point we start to play with these two theories. Both sides of the duality (8.46) 
have a background U(1) gauge field A. The key idea is to promote this to a dynamical 
gauge field. This is misleadingly easy in our notation: we simply write a instead of A. 
As we explained in Section 8.1, gauging a U(1) symmetry in d = 2+1 results in a new 
global symmetry, 


$$j^\mu = \frac{1}{2\pi}\epsilon^{\mu\nu\rho}\partial_\nu a_\rho$$


We couple this to a background gauge field C. This means that we add $\frac{1}{2\pi}AdC$ to both 
sides of (8.46), and then make $A \to a$ dynamical. This results in a new duality, 


$$|D_b\phi|^2 - |\phi|^4 + \frac{1}{4\pi}\,bdb + \frac{1}{2\pi}\,adb + \frac{1}{2\pi}\,adC \quad\longleftrightarrow\quad i\bar\psi\not\!\!D_a\psi - \frac{1}{2}\frac{1}{4\pi}\,ada + \frac{1}{2\pi}\,adC$$


The number of gauge fields on the left-hand side is proliferating. But, at this point, 
something nice happens: the gauge field a only appears linearly in the action. This 
means that it acts as a Lagrange multiplier, setting db = −dC. But, this, in turn, 
freezes the first dynamical gauge field b to be equal, up to a gauge transformation, to the new 
background field −C. The upshot is that we end up with a scalar field theory with no 
dynamical gauge fields at all, and the duality 


$$|D_{-C}\phi|^2 - |\phi|^4 + \frac{1}{4\pi}\,CdC \quad\longleftrightarrow\quad i\bar\psi\not\!\!D_a\psi - \frac{1}{2}\frac{1}{4\pi}\,ada + \frac{1}{2\pi}\,adC \qquad (8.47)$$


This is a new equivalence between two, seemingly very different looking, theories. The 
left-hand-side is something very familiar: it is the XY Wilson-Fisher fixed point. In 
contrast, the right-hand side is a strongly coupled U(1) gauge theory. The claim is 
that these two fixed points are the same, so 


XY Wilson-Fisher $\longleftrightarrow$ $U(1)_{-1/2}$ coupled to a Dirac fermion 


From our first bosonization duality, we have derived another. Similarly, we can go in 
reverse: starting from the equality of partition functions (8.47), it is not hard to derive 
the original (8.46). 
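

The step that removes the two dynamical gauge fields from the scalar side can be written out explicitly. The following is a schematic sketch in the form notation introduced above. It ignores global subtleties, such as flat connections and the normalisation of the measure, which are not addressed here. The terms linear in a combine into $\frac{1}{2\pi}\,a\,d(b+C)$, and integrating over a imposes a delta function:

```latex
\begin{align*}
\int \mathcal{D}a\; \exp\left(\frac{i}{2\pi}\int a\, d(b+C)\right)
  &\;\propto\; \delta\big(d(b+C)\big)
  \quad\Longrightarrow\quad b = -C \ \ \text{(up to a gauge transformation)} \\[4pt]
\Big[\,|D_b\phi|^2 - |\phi|^4 + \frac{1}{4\pi}\, b\,db\,\Big]_{b=-C}
  &= |D_{-C}\phi|^2 - |\phi|^4 + \frac{1}{4\pi}\, C\,dC
\end{align*}
```

The Chern-Simons term for b is quadratic, so the sign of C drops out of it, while the covariant derivative remembers that the scalar now carries charge under −C.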


We can continue in this vein, adding different matter fields and gauging global sym- 
metries, to derive an infinite number of dualities between different 3d Abelian theories 
with Chern-Simons terms. This is referred to as the duality web. Below we give just a 
handful of interesting examples. 


8.6.4 Particle-Vortex Duality Revisited 


Our second bosonization duality (8.47) includes a Chern-Simons coupling for the back- 
ground field C on the left-hand-side. Since we don’t integrate over the background 
field, there is nothing to stop us taking this term onto the other side of the equation. 
We will also take this opportunity to rename some of the variables. The duality (8.47) 
is equivalent to 


1 


AdA A 
T (8.48) 


= Fi 1 
Dae -lot — ipw- sp 0t dA- 
T QT 
Having moved the background Chern-Simons term to the other side, we now play the 
same game as before: we add a term $\frac{1}{2\pi} AdC$, and then again promote $A$ to a dynamical 
field, $A \to a$. We now have 

$$|D_{-a}\phi|^2 - |\phi|^4 + \frac{1}{2\pi}\, adC \;\longleftrightarrow\; i\bar\psi \not\!\!D_b \psi - \frac{1}{2}\frac{1}{4\pi}\, bdb + \frac{1}{2\pi}\, bda - \frac{1}{4\pi}\, ada + \frac{1}{2\pi}\, adC$$

Again, there are a lot of gauge fields on the right-hand-side. Now $a$ does not appear 
linearly as a Lagrange multiplier, but quadratically. Still, it is begging to be integrated 
out by imposing the equation of motion $a = b + C$, leaving us with 


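$$|D_{-a}\phi|^2 - |\phi|^4 + \frac{1}{2\pi}\, adC \;\longleftrightarrow\; i\bar\psi \not\!\!D_b \psi + \frac{1}{2}\frac{1}{4\pi}\, bdb + \frac{1}{2\pi}\, bdC + \frac{1}{4\pi}\, CdC \qquad (8.49)$$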


This is still a bosonization duality, relating a scalar theory to a fermionic theory. The 
right-hand-side is very nearly the same expression that we started with in (8.48), 
with one important difference: two of the Chern-Simons terms have their sign flipped. In 
fact, if we send $C \to -C$, all of the Chern-Simons terms have their sign flipped relative to (8.48). In 
other words, this partition function describes the time reversal of the theory in (8.48). 
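
The step of integrating out $a$ on the right-hand-side can be checked term by term. The sketch below assumes the right-hand-side before integrating out is $i\bar\psi \not\!\!D_b\psi - \frac{1}{2}\frac{1}{4\pi}bdb + \frac{1}{2\pi}bda - \frac{1}{4\pi}ada + \frac{1}{2\pi}adC$, and drops total derivatives. The symbol $\mathcal{L}_a$ is introduced here only to collect the terms involving $a$.

```latex
\begin{align*}
\mathcal{L}_a &= -\tfrac{1}{4\pi}\, a\,da + \tfrac{1}{2\pi}\, a\,d(b + C)
  && \text{(using } b\,da = a\,db \text{ up to a total derivative)} \\
\frac{\delta \mathcal{L}_a}{\delta a} = 0 \;&\Rightarrow\; da = d(b + C)
  \;\Rightarrow\; a = b + C
  && \text{(up to gauge transformations)} \\
\mathcal{L}_a \Big|_{a = b + C} &= \tfrac{1}{4\pi}\, (b + C)\, d(b + C)
  = \tfrac{1}{4\pi}\, b\,db + \tfrac{1}{2\pi}\, b\,dC + \tfrac{1}{4\pi}\, C\,dC \\
-\tfrac{1}{2}\tfrac{1}{4\pi}\, b\,db + \mathcal{L}_a \Big|_{a = b + C}
  &= +\tfrac{1}{2}\tfrac{1}{4\pi}\, b\,db + \tfrac{1}{2\pi}\, b\,dC + \tfrac{1}{4\pi}\, C\,dC
\end{align*}
```

The last line reproduces the Chern-Simons terms on the right-hand-side of (8.49). Sending $C \to -C$ flips the sign of the mixed $bdC$ term only, which is how all three terms end up with signs opposite to those in (8.48).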


As we have seen, Chern-Simons terms break time reversal, so one would not naively 
expect that $U(1)_{1/2}$ coupled to a Dirac fermion is time reversal invariant. However, if 
we take the time reversal of the duality (8.48), we have 


$$|D_{-C}\phi|^2 - |\phi|^4 \;\longleftrightarrow\; i\bar\psi \not\!\!D_b \psi + \frac{1}{2}\frac{1}{4\pi}\, bdb + \frac{1}{2\pi}\, bd(-C) + \frac{1}{4\pi}\, CdC \qquad (8.50)$$


By charge conjugation we can replace $D_{-C}\phi \to D_C\phi$. The left-hand-side is once again 
the XY critical point. It is clearly time-reversal invariant. The duality tells us that 
$U(1)_{1/2}$ coupled to a massless fermion must be secretly time reversal invariant: time reversal must 
emerge as a discrete symmetry of the quantum theory. 


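Combining (8.49), after sending $C \to -C$, together with (8.50), and using charge conjugation 
to flip the sign of the dynamical field $a$, gives us yet another duality. It is 

$$|D_{a}\phi|^2 - |\phi|^4 + \frac{1}{2\pi}\, adC \;\longleftrightarrow\; |D_{C}\phi|^2 - |\phi|^4$$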


But this is precisely the statement of particle-vortex duality that we discussed in Section 
8.2.1: the left-hand-side is the Abelian Higgs model while the right-hand-side is the 
XY model. We learn that particle-vortex duality = bosonization$^2$. 


8.6.5 Fermionic Particle-Vortex Duality 


Above we have managed to use 3d bosonization to derive a duality between purely 
bosonic theories. We might ask: can we do something similar to derive a duality 
between purely fermionic theories? The answer is yes. But, there will be a new subtlety 
that we have to address. 
We can see this subtlety by retracing the steps above. To derive bosonic particle- 
vortex duality, we started with the bosonization dual (8.47), moved the background 
Chern-Simons term to the other side, and then promoted the background gauge field 
to a dynamical one. To derive a fermionic particle-vortex duality, it is natural to 
attempt the same manoeuvres for our original bosonization duality (8.46), 


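$$|D_{b}\phi|^2 - |\phi|^4 + \frac{1}{4\pi}\, bdb + \frac{1}{2\pi}\, Adb \;\longleftrightarrow\; i\bar\psi \not\!\!D_A \psi - \frac{1}{2}\frac{1}{4\pi}\, AdA \qquad (8.51)$$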


But we immediately run into a stumbling block: we can’t move the background Chern- 
Simons term to the other side because it is half-integer valued. It is needed on the 
right-hand-side to ensure that the fermion partition function is gauge invariant. 


To get around this, we will stipulate that the background gauge field A only admits 
flux quantised as 


$$\frac{1}{2\pi}\int dA \in 2\mathbb{Z}$$


This is twice the usual requirement. We can then write 

$$A = 2C$$


with C a background gauge field whose flux is correctly quantised. The duality (8.51) 
is then 


$$|D_{b}\phi|^2 - |\phi|^4 + \frac{1}{4\pi}\, bdb + \frac{2}{2\pi}\, Cdb \;\longleftrightarrow\; i\bar\psi \not\!\!D_{2C} \psi - \frac{2}{4\pi}\, CdC \qquad (8.52)$$


All Chern-Simons terms are now properly quantised. But the fermion on the right- 
hand-side has charge 2 under the gauge field $C$. If we give a fermion of charge $q$ a mass 
$m$ and integrate it out, it will generate a Chern-Simons term with level 

$$\frac{1}{2}\, q^2\, \mathrm{sign}(m)$$

(This follows from the fact that the one-loop diagram in Section 8.5 has two insertions 
of the photon-fermion vertex.) So integrating out a fermion of charge 2 generates an 
integer-valued Chern-Simons level and there is no problem with the parity anomaly. 
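
The counting can be spelled out as a short piece of arithmetic. The sketch below only evaluates the formula $\frac{1}{2}q^2\,\mathrm{sign}(m)$ quoted above and substitutes $A = 2C$ into the Chern-Simons terms of (8.51). The label $k_{\rm gen}$ for the generated level is a name introduced here for convenience.

```latex
\begin{align*}
k_{\rm gen}(q) &= \tfrac{1}{2}\, q^2\, \mathrm{sign}(m) \\
q = 1 &: \quad k_{\rm gen} = \pm \tfrac{1}{2}
  && \text{(half-integer: needs the explicit } -\tfrac{1}{2}\tfrac{1}{4\pi} AdA \text{ in (8.51))} \\
q = 2 &: \quad k_{\rm gen} = \pm 2
  && \text{(integer)} \\[4pt]
-\tfrac{1}{2}\tfrac{1}{4\pi}\, AdA \,\Big|_{A = 2C} &= -\tfrac{1}{2}\tfrac{4}{4\pi}\, CdC = -\tfrac{2}{4\pi}\, CdC \\
\tfrac{1}{2\pi}\, Adb \,\Big|_{A = 2C} &= \tfrac{2}{2\pi}\, Cdb
\end{align*}
```

With $A = 2C$ the background term in (8.52) has integer level $-2$, so it can be moved across the duality without upsetting gauge invariance of the fermion partition function.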


Now let us play games with this theory. We will move the $CdC$ background Chern- 
Simons term to the other side, add $\frac{1}{2\pi} BdC$ to both sides, and finally breathe life into $C$ 
to make it dynamical, $C \to a$. We have 


$$|D_{b}\phi|^2 - |\phi|^4 + \frac{1}{4\pi}\, bdb + \frac{2}{2\pi}\, adb + \frac{2}{4\pi}\, ada + \frac{1}{2\pi}\, adB \;\longleftrightarrow\; i\bar\psi \not\!\!D_{2a} \psi + \frac{1}{2\pi}\, adB$$


The mess of mixed Chern-Simons terms on the left-hand-side is easily dealt with: we 
simply define the new linear combination 


$$\hat{a} = a + b$$


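Then we find 

$$|D_{b}\phi|^2 - |\phi|^4 - \frac{1}{4\pi}\, bdb - \frac{1}{2\pi}\, bdB + \frac{2}{4\pi}\, \hat{a}d\hat{a} + \frac{1}{2\pi}\, \hat{a}dB \;\longleftrightarrow\; i\bar\psi \not\!\!D_{2a} \psi + \frac{1}{2\pi}\, adB$$

But the first four terms in this expression (those which involve $\phi$ and $b$) coincide with 
the time-reversal of the left-hand-side of (8.51). We can then use the duality (8.51) to 
replace them, leaving us with the promised fermion-fermion duality, 


$$i\bar\psi \not\!\!D_{A} \psi + \frac{1}{2}\frac{1}{4\pi}\, AdA + \frac{2}{4\pi}\, \hat{a}d\hat{a} + \frac{1}{2\pi}\, \hat{a}dA \;\longleftrightarrow\; i\bar\psi \not\!\!D_{2a} \psi + \frac{1}{2\pi}\, adA$$

where we’ve taken this opportunity to rename the background field $B$ as $A$. 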
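
The change of variables and the use of time reversal can be verified term by term. The sketch below assumes the Chern-Simons terms on the left-hand-side before the field redefinition are $\frac{1}{4\pi}bdb + \frac{2}{2\pi}adb + \frac{2}{4\pi}ada + \frac{1}{2\pi}adB$, substitutes $a = \hat{a} - b$, and drops total derivatives.

```latex
\begin{align*}
\tfrac{2}{2\pi}\, a\,db &= \tfrac{2}{2\pi}\, \hat{a}\,db - \tfrac{4}{4\pi}\, b\,db \\
\tfrac{2}{4\pi}\, a\,da &= \tfrac{2}{4\pi}\, \hat{a}\,d\hat{a} - \tfrac{2}{2\pi}\, \hat{a}\,db + \tfrac{2}{4\pi}\, b\,db \\
\tfrac{1}{2\pi}\, a\,dB &= \tfrac{1}{2\pi}\, \hat{a}\,dB - \tfrac{1}{2\pi}\, b\,dB \\[4pt]
\text{Sum with } \tfrac{1}{4\pi}\, b\,db : \quad
  & \left(\tfrac{1}{4\pi} - \tfrac{4}{4\pi} + \tfrac{2}{4\pi}\right) b\,db
    - \tfrac{1}{2\pi}\, b\,dB + \tfrac{2}{4\pi}\, \hat{a}\,d\hat{a} + \tfrac{1}{2\pi}\, \hat{a}\,dB \\
  &= -\tfrac{1}{4\pi}\, b\,db - \tfrac{1}{2\pi}\, b\,dB
    + \tfrac{2}{4\pi}\, \hat{a}\,d\hat{a} + \tfrac{1}{2\pi}\, \hat{a}\,dB \\[4pt]
\text{Time reversal of (8.51), } A \to B : \quad
  & |D_b\phi|^2 - |\phi|^4 - \tfrac{1}{4\pi}\, b\,db - \tfrac{1}{2\pi}\, B\,db
    \;\longleftrightarrow\;
    i\bar\psi \not\!\!D_B \psi + \tfrac{1}{2}\tfrac{1}{4\pi}\, B\,dB
\end{align*}
```

The mixed $\hat{a}\,db$ terms cancel, which is why $\hat{a}$ decouples from $\phi$ and $b$. The remaining $\phi$ and $b$ terms match the time-reversed left-hand-side of (8.51), so they can be traded for a single Dirac fermion coupled to $B$.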


What is this final expression telling us? The right-hand-side is a U(1) gauge theory 
coupled to a single Dirac fermion of charge 2. The left-hand-side is very nearly a free 
fermion. But it also includes a topological theory, $U(1)_2$, decoupled from the fermion and described by the 
dynamical gauge field $\hat{a}$. We learn that 


$$U(1)\ \text{with Dirac fermion of charge 2} \;\longleftrightarrow\; \text{Free Dirac fermion} + U(1)_2$$


This is the fermionic version of particle-vortex duality, with the monopole operators 
of the gauge theory identified with the fermion. A closely related duality was first 
suggested by Son in the context of the half-filled Landau level. It has also been invoked 
in the context of topological insulators. 


8.7 Further Reading 


Quantum field theories in d = 2+1 dimensions have a rather special relation to the 
real world because, after a Wick rotation, many of them (but not all of them!) can be 
viewed as statistical field theories in d = 3+0 dimensions, where they describe systems 
near critical points. For example, $\phi^4$ scalar field theory in d = 3 dimensions describes 
the water boiling in your kettle. (Admittedly, you might need to put a fairly tight lid 
on the kettle.) 


From the high energy perspective, d = 2+1 dimensions offer another arena to study 
questions about gauge theories that seemed too challenging in d = 3+1. Polyakov’s 
demonstration of confinement [159, 160], driven by the proliferation of instantons 
(monopoles), was a highlight in this regard. Similarly, particle vortex duality was 
first introduced by Peskin [152], in an attempt to see whether a similar duality in 
d = 3 + 1 could help explain confinement. This was subsequently rediscovered in the 
condensed matter community by Dasgupta and Halperin, who also performed numerics 
to find convincing evidence of a second order phase transition [37]. Both of these papers 
originally expressed the duality in terms of lattice theories; the continuum version that 
we described here was first proposed in [61]. 
Chern-Simons theory was introduced by Deser, Jackiw and Templeton [42, 43], ini- 
tially as a surprising, gauge invariant mechanism to give the three dimensional photon 
a mass. The depth of the theory became apparent with Witten’s Fields Medal winning 
work on knot invariants [228], and the connection to WZW models [53]. The inter- 
play between massive fermions and Chern-Simons terms was discovered in [149] and 
[168, 169]; a more modern perspective was provided by Witten in [230]. A very clear 
discussion of the properties of Chern-Simons theories can be found in the lectures by 
Dunne [49]. You can read more about the subtleties related to the quantisation of 
Abelian Chern-Simons theories in the appendices of [175] and [176]. 


The story of 3d bosonization has a long and complicated history. The idea that one 
can use Chern-Simons terms to transmute the statistics of non-relativistic particles from 
bosons to fermions was pointed out by Wilczek and Zee [211]. Polyakov was the first to 
conjecture that there might be a relativistic version of bosonization, but he missed the 
need to bosonize at the Wilson-Fisher fixed point [161]. The full story came by bringing 
together a wonderfully diverse set of ideas from both high energy and condensed matter 
physics. These include dualities in supersymmetric theories [110], large N bosonization 
and its relation to holography [76, 5, 6, 7], and physics associated to superfluids [13], the 
half-filled Landau level [186] and topological insulators [200, 136]. The web of dualities 
among Abelian gauge theories, relating bosonization and particle-vortex duality, was 
first described in [119, 176]. 